GLOSSARY OF LACIE TERMS, ABBREVIATIONS AND ACRONYMS


Biological Stage

Specific stages of development of a crop which can be
recognized by a major change in plant structure, i.e.,
emergence after germination, jointing, heading, etc.,
and are represented by integers on the Robertson
Biometeorological Time Scale.


Biowindow

A Landsat data acquisition period that is tied to the
biostages of wheat development. The LACIE approach is
based upon the judgment that wheat can be spectrally
separated adequately from other crops by analysis of
up to four acquisitions of Landsat data during the
growing season. The biowindow opening and closing
dates may be updated if there is a significant lag or
advancement in the current crop growth. The sequence
chosen includes acquisitions during the following
biowindows:

a. Crop establishment - from planting to the
booting stage

b. Green - from the booting stage to the heading
stage

c. Heading - from the heading stage to the soft
dough stage

d. Mature - from the soft dough stage to the
harvest stage


Blind Site

A LACIE sample segment, chosen at random after normal
analysis, used for testing classification performance


CCEA

Center for Climatic and Environmental Assessment, an
organization of the National Oceanic and Atmospheric
Administration (NOAA), Columbia, Missouri


Classification

In computer-aided analysis of remotely sensed data,
the process of assigning data points to specified
classes by a testing process in which the spectral
properties of each unknown data point are compared
with spectral properties typical of the subject being
classified


Classification Error

Classification error is a measure of the degree to
which the LACIE Classification and Mensuration
Subsystem (CAMS) can estimate the wheat area in one or
more LACIE samples.


Crop Calendar

A calendar depicting the growth-development or
biological stages of the major crop types within a
specified region.


Crop Calendar Adjustment

An adjustment made, on the basis of current weather,
to the normal crop calendar


Crop Reporting District

A geographical area used by the U.S. Department of
Agriculture for the collection and reporting of
agricultural information. Each district consists of
several counties.


GSFC

Goddard Space Flight Center, a NASA installation in
Greenbelt, Maryland


ITS

Intensive Test Sites; U.S. and Canadian locations in
which detailed crop information is collected by using
ground and airborne equipment


JSC

The Lyndon B. Johnson Space Center, a NASA installation
in Houston, Texas


LACIE 


Large Area Crop Inventory Experiment 


Landsat 


Formerly the Earth Resources Technology Satellite (ERTS).
This earth-observing satellite operates in a circular,
sun-synchronous, near-polar orbit at an altitude of
approximately 915 km (494 n.mi.). It orbits the earth
14 times a day and views the same scene every 18 days.


Landsat Data Set

The electronic or film products produced for a particular
acquisition of a sample segment


Landsat Scene


The collection of the image data of one nominal framing
area (185 km square) of the earth's surface; this
includes data from each of four spectral bands or
channels on the satellite multispectral scanner.


Mensuration 


The act of measuring, in the case of LACIE, measuring 
surface area in a particular crop 


Multispectral 


Pertaining to radiation from several discrete bands of 
the electromagnetic spectrum 




Multispectral Scanner or MSS

Multispectral scanner system, sometimes referred to
simply as the multispectral scanner, is the remote
sensing instrument on Landsat that measures reflected
sunlight in various spectral bands or wavelengths.


Multitemporal Analysis

Analysis of data sets over the same area acquired at
different times.


NASA

National Aeronautics and Space Administration


n.mi.

Nautical mile. Equivalent to 1/60 degree at the earth's
equator, or approximately 1852 meters (6076 ft.)


NOAA

National Oceanic and Atmospheric Administration of the
U.S. Department of Commerce.


Nonsupervised Classification

A procedure by which multispectral data are grouped
into spectrally similar clusters.


Pixel

Picture element; refers to one instantaneous field of
view (IFOV) as recorded by the multispectral scanner
system. On the Landsat system it is equivalent to
approximately 0.44 hectare (1.09 acres). One Landsat
frame contains approximately 7.36 x 10^6 pixels.


R&D

Research and Development


RT&E

Research, Test, and Evaluation


Sample Segment

A 5 x 6 n.mi. area selected by stratified random sampling.
Information on this area is recorded by the multispectral
scanner and transformed into computer compatible
tapes and film products.


Sampling Error

A measure of the degree to which the wheat area in the
LACIE sample segments represents the wheat area contained
in the survey region being sampled


Scene Registration

The process of superimposing points on two data sets
taken at different times


Signature Extension

The analysis process using the spectral characteristics
or "signature" of one sample segment to perform the
classification on another sample segment


SRS

Statistical Reporting Service, an agency of the U.S.
Department of Agriculture


Supersite

A particular intensive test site for which additional
ground data, such as radiation measurements, are
acquired. Currently, there are three supersites:
Williams County, N.D., Hand County, S.D., and Finney
County, Kansas


Supervised Classification

A procedure used in data processing in which remotely
sensed data of known classes are used to establish
the decision logic from which unclassified data are
assigned to classes.




Test Field

The spatial sample of digital data of a known ground
feature selected by the investigator which is used to
validate the statistical parameters generated from
training field samples.


Training Field

The spatial sample of digital data of a known ground
feature selected by the analyst, from which the spectral
characteristics are computed for use in supervised
multispectral classification of remotely sensed data.
The statistics associated with training fields provide
the input to "train" the computer to discriminate
between different classes in the scene.


USDA

United States Department of Agriculture


WMO

World Meteorological Organization


SECTION 1.0
INTRODUCTION 


1.1 GENERAL 

The purpose of this report is to provide senior managers in
participating LACIE agencies with an evaluation of the experiment.
While the main thrust of the report is the evaluation, a brief
synopsis of actual achievements is provided as a basis for
the evaluation.

The Large Area Crop Inventory Experiment (LACIE) is a cooperative
project of the U.S. Department of Agriculture (USDA), the National
Aeronautics and Space Administration (NASA), and the National
Oceanic and Atmospheric Administration (NOAA) of the U.S.
Department of Commerce. The major goals of LACIE are:

1. Evaluate and demonstrate the capability of existing
technology (remote sensing, data processing and analysis, and
other associated technologies) to make improved worldwide
crop-production information available to decision makers in
a cost-effective manner; this test of technology is to be
conducted in a quasi-operational environment.

2. Research and develop alternate approaches and techniques
which, upon evaluation, are qualified to be incorporated
into the LACIE quasi-operational system where required to
meet performance goals or to improve efficiency.

The experiment will span approximately 3-1/2 years and will
progress from Phase I, which concentrated on a system test to
determine wheat areal extent within selected wheat growing regions
of the U.S.,¹ recognition analyses in selected other areas, and
yield model development and yield feasibility determinations over
selected regions in the U.S., through Phases II and III, which
will test LACIE capabilities to develop area, yield, and
production estimates for other major wheat-producing areas of the
world in a quasi-operational mode.

Evaluation reports are scheduled at the completion of each of the
three phases of LACIE. These reports are intended to provide
executive-level managers of the participating agencies with
information to support decisions related to future agency
commitments and also to evaluate how well the objectives are met
during the period covered by the report.

The intent in this report is to document the results of Phase I
of LACIE. Results on the accuracy of the estimates are treated
in summary fashion in the body of the report, and in more detail
in the appendices.

The scope of this report represents the progress during Phase I 
of LACIE. However, to present a complete synopsis of activity to 
date, brief mention is made of key events before the initiation 
of Phase I of the experiment. 


¹LACIE is designed to meet USDA needs in areas where ground
information is not readily available. To test the design in an area where
comparison information is available, the U.S. (Great Plains) has been chosen.
LACIE is not designed to improve the accuracy of U.S. crop reports.




1.2 BACKGROUND


The need for crop inventory information was stated by the U.S.
Department of Agriculture (USDA) as follows:

"To permit rational decisions in areas such as production, 
marketing, transportation, and international trade, we must 
have up-to-date, accurate information on world food supplies 
and world food needs. The Department of Agriculture has been 
assigned the responsibility for collecting and reporting crop 
production information to the public."²

In anticipation of helping to fulfill information needs such as 
stated above, the remote sensing community has for several years 
been developing a key part of a new technology for conducting 
large-scale crop inventories. 

Some of the major events in the development and application of
this technology were as follows:


Late 1950's     Surveys of agricultural terrain by black and white
                aerial photography using camouflage detection film
                (reflective infrared)

Early 1960's    Development of airborne multispectral scanners and
                large-scale digital-processing techniques

1966            First computer-aided classification of wheat and other
                crops using airborne multispectral scanner data

1969            Apollo multiband camera experiment (S-065) simulating
                Landsat spectral bands. First computer-aided
                classification of wheat and other crops using
                satellite data

1971            Corn Blight Watch Experiment, first large area
                agricultural effort; used both image analysis and
                computer-aided analysis of airborne multispectral
                scanner data

1972            Landsat 1 launched; the start of many agriculturally
                oriented investigations by Landsat scientific
                investigators, including several by representatives
                of the USDA and NASA and one joint project on crop
                identification


²From a presentation by Clayton K. Yeutter, Assistant Secretary for
International Affairs and Commodity Programs, U.S. Dept. of Agriculture,
to the Committee on Science and Technology, U.S. House of Representatives,
February 4, 1975.


There had been acceptable progress in the development of
techniques for the analysis of satellite-acquired multispectral data
for the purpose of identification and measurement of wheat areas.
This capability to identify and measure wheat area provided,
however, only one component for the estimation of wheat
production. For USDA crop-reporting purposes, production (i.e., area
in wheat multiplied by yield for that area) is the quantity of
primary interest. Although there is an expectation that satellite
multispectral observations will contribute to yield determination
at some future date, this technology was not sufficiently developed
to be included in the LACIE mainstream program. An alternate
approach, however, using meteorological data (from ground stations
and/or satellites) in yield models was in the course of
development and was considered the most promising for supporting initial

Interest in pursuing inventory techniques was intensified by
grain-production shortfalls in some areas of the world in 1972 and 1973
and by an increase in consumption during those years. This
interest spurred planning activity in NASA, USDA, and NOAA, and
the time was judged appropriate for a large-scale experiment to
validate the technology as applied to a crop-inventory system.

This technology had been previously tested only in local situa- 
tions and with very limited amounts of data. Wheat was chosen as 
the crop for the initial experiment, and a preliminary project plan 
was developed in the fall of 1973. 

An interagency Memorandum of Understanding (MOU) was drafted and
detailed planning was carried out through the summer of 1974 with
coordination among the three agencies. The general shape of the
experiment was essentially defined by the middle of 1974, and all
agencies began staffing the activity by the fall of 1974. An
overall schedule for the project was approved in early November 1974.

The activity was announced November 6, 1974, and was described
briefly by Secretary of State Kissinger at the World Food
Conference³ in November 1974 as follows:

"Our space, agriculture, and weather agencies will test
advanced satellite techniques for surveying and
forecasting important food crops. We will begin in North
America and then broaden the project to other parts of
the world. To supplement the World Meteorology
Organization (WMO) on climate, we have begun our own analysis
of the relationship between climate patterns and crop
yields over a statistically significant period. This
is a promising and potentially vital contribution to
rational planning of global production."


³From a speech by Henry F. Kissinger, Secretary of State of the
United States of America, in Rome, Italy, November 4, 1974.

1.3 TECHNICAL DESCRIPTION

The objective of LACIE is to estimate production of wheat on
a country-by-country basis. To estimate wheat production on a
country basis, the country is subdivided into subareas called
strata, where yield (quintal/hectare or bushel/acre) and the
prevalence of wheat planted are rather uniform. Yield and
the areal extent of wheat within each stratum are determined by
independent methods and then multiplied together to obtain wheat
production (quintals or bushels) for the stratum. The production
estimates in each stratum are then added to obtain production at
other geographic levels. In addition, area and yield are
estimated for each stratum and aggregated to determine wheat area
and yield at regional and country levels.
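
The stratum-level arithmetic described above can be illustrated with a short sketch. This is not the LACIE aggregation software; the stratum names and numbers are hypothetical, and computing the aggregated yield as total production divided by total area (an area-weighted mean) is an assumption made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """One stratum with independently estimated area and yield (hypothetical values)."""
    name: str
    wheat_area_ha: float    # estimated wheat area, hectares
    yield_q_per_ha: float   # estimated yield, quintals per hectare

    @property
    def production_q(self) -> float:
        # Production for the stratum = area x yield (quintals)
        return self.wheat_area_ha * self.yield_q_per_ha


def aggregate(strata):
    """Sum area and production over strata; derive an area-weighted yield.

    The area-weighted yield is an assumption of this sketch, not a formula
    taken from the report.
    """
    total_area = sum(s.wheat_area_ha for s in strata)
    total_production = sum(s.production_q for s in strata)
    mean_yield = total_production / total_area if total_area else 0.0
    return total_area, mean_yield, total_production


# Hypothetical two-stratum region
strata = [
    Stratum("stratum A", 1000.0, 20.0),
    Stratum("stratum B", 3000.0, 10.0),
]
area, mean_yield, production = aggregate(strata)
print(area, mean_yield, production)
```

Summing production stratum by stratum, rather than multiplying a regional area by a regional yield, keeps high-yield and low-yield strata from being averaged together before the multiplication.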

Area is derived by classification and mensuration of Landsat
Multispectral Scanner (MSS) data acquired on a sampling of about
2 percent of the agricultural area in all regions where wheat
is a major crop. Maximum use is made of computer-aided analysis
to provide the most timely estimates possible.

Yield is estimated from statistical models which relate crop
yield to local meteorological conditions, notably precipitation
and temperature. Initially, these data are being obtained from
the World Meteorological Network of ground stations. As the
experiment progresses, use of supplemental meteorological data
from NOAA environmental satellites is planned.


The project has involved the assembly of a crop-inventory system
from available components designed for Research and Development
(R&D). That is, the system is intended to test the functions
necessary for crop inventory, not to provide a streamlined,
cost-effective operational tool. The intent is to utilize the
experience gained to support, as a concurrent effort, the design of a
user-oriented operational system and the prediction of the
performance and cost of such a user system.

LACIE will extend over three global crop seasons, each of which
is considered a LACIE phase. The early phases will concentrate
primarily on the most important wheat-growing region of the U.S.,
the hard red wheat region in the U.S. Great Plains. This region
comprises 9 states⁴ which account for, typically, 90 percent
of the hard red wheat and 75 percent of the total U.S. wheat.
Then the experiment will be extended to include the major wheat
producing regions of the world. These three phases overlap
because they are based upon global crop-growing seasons.

The first phase, covered in this report, began in November 1974
and was devoted primarily to the evaluation of the ability to
locate, identify, and estimate the area of wheat in the Great
Plains of the U.S. Data from the USDA Statistical Reporting
Service were used as a reference from which to determine the
accuracy of LACIE performance. Also during this phase,
development and feasibility testing of wheat yield models were conducted.

In Phase II, the major area of coverage remains the U.S. Great
Plains; however, Canada will be included, and selected regions
outside North America will be analyzed. Phase II extends from
October 1975 through April 1977 and involves an integrated test
of the crop identification and area estimation capability along
with use of the yield models to predict wheat production in the
regions being studied. In Phase III, the LACIE capability
should be able to support the estimation of wheat area, yield,
and production in several countries, should such a scope be
decided upon by the participating agencies. The current LACIE
schedule is shown in figure 1-1.


⁴Texas, Oklahoma, Kansas, Nebraska, Colorado, North Dakota, South
Dakota, Montana, and M

SECTION 2.0

EVALUATION OVERVIEW OF PHASE I 

2.1 GENERAL 

Phase I of LACIE was a period of bringing system components into
operation and testing their ability to meet experiment goals.

Area estimation was performed in a quasi-operational mode;
yield and production estimation in a feasibility test mode.
Performance during Phase I of LACIE was very encouraging.
Table 2-1 summarizes Phase I goals and accomplishments.

An overall experiment design was completed (hardware, software,
sample design, etc.) to support all three phases of LACIE and
the Phase I system was exercised successfully. The initial
quasi-operational system for area estimation was implemented
and began operation on schedule. Yield and production
estimates during Phase I were made throughout the phase but in a
test and evaluation mode. Reports on area for the U.S. Great
Plains were prepared monthly throughout the growing season. A
single summary report on each of yield and production was
developed at the end of the phase.

The accuracy performance of the LACIE estimates, based on a
number of tests in the U.S. Great Plains, is considered
marginally satisfactory in consideration of the 90/90 "at-harvest"
criterion for wheat production estimation. This criterion
specifies that at-harvest production estimates at a country
level be 90 percent accurate 9 years out of 10, or 90 percent
of the time.
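
The 90/90 criterion can be related to the coefficient-of-variation tolerances quoted in this section (about 6 percent for production and 4.25 percent for each of area and yield). The sketch below is one way to arrive at figures of that size; it assumes an unbiased, normally distributed production estimate and independent area and yield errors of equal size. Those assumptions, and the function name, are illustrative and are not stated as the derivation used by the project.

```python
from math import sqrt
from statistics import NormalDist


def max_cv(tolerance=0.10, confidence=0.90):
    """Largest c.v. for which an unbiased, normally distributed estimate
    falls within +/- tolerance of the true value with the given probability.
    (Assumption of this sketch: normal errors, no bias.)
    """
    # Two-sided interval: P(|relative error| <= tolerance) >= confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return tolerance / z


cv_production = max_cv()                # roughly 0.06, i.e. about 6 percent
# If area and yield errors are independent and equal in size, their c.v.'s
# add in quadrature, so each component may be at most cv_production / sqrt(2).
cv_each = cv_production / sqrt(2.0)     # a little over 0.04

print(f"production c.v. limit: {cv_production:.4f}")
print(f"area or yield c.v. limit: {cv_each:.4f}")
```

Under these assumptions the production limit comes out near 6 percent and the per-component limit near 4.3 percent, close to the 6 and 4.25 percent figures used in the report.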



TABLE 2-1.- PHASE I GOALS AND ACCOMPLISHMENTS


Goal:            Develop a system to test the components of the
                 LACIE technology
Accomplishment:  An overall experiment design was completed (hardware,
                 software, sample design, etc.) to support all three
                 phases and the Phase I system was exercised
                 successfully.

Goal:            Conduct tests of the area-estimation capability
                 over selected area within the U.S. Great Plains
                 (the "Yardstick" region)
Accomplishment:  Tests successfully conducted for the nine states
                 selected (the U.S. Great Plains).

Goal:            Evaluate the feasibility of wheat classification
                 over representative foreign locations for
                 Phase II and Phase III
Accomplishment:  Tests conducted over segments in all LACIE countries.
                 Experienced difficulties in some countries with small
                 fields and with cloud cover in some cases. In other
                 cases classification could be easier due to large
                 fields and more uniform agriculture than in U.S.

Goal:            Conduct feasibility tests of the yield and
                 production estimation capability
Accomplishment:  Yield models for U.S. Great Plains checked historically
                 over a 10-year period, production tested for 1975.
                 Basic approach is adequate. Some improvements will be
                 required.

Goal:            Evaluate performance for accuracy, timeliness,
                 and utility
Accomplishment:  Accuracy of results assessed by USDA as generally
                 satisfactory. Timeliness and utility to be evaluated
                 during Phase II.

Goal:            Modify the technology as required for Phase II
Accomplishment:  Area-estimation technology revised and yardstick area
                 reprocessed; areas for yield model improvement
                 identified and some improvements implemented. Phase II
                 initiated as planned.

Goal:            Conduct parallel and supportive research, test
                 and evaluation to investigate improved
                 approaches
Accomplishment:  Phase I program produced several improvements to
                 technology approach. These are being incorporated into
                 Phase II and Phase III.

Goal:            Implement the additional components of the
                 system required to support quasi-operational
                 yield and production estimates in Phase II.
Accomplishment:  Components were successfully implemented and are being
                 exercised in Phase II.





Tests were also conducted over segments in all LACIE countries
planned for Phases II and III. The results of this testing
showed some regions for which area and yield estimation will
be more difficult than in the U.S., the main factors being
small field sizes, increased cloud cover, and poorer historic
data. In other cases, however, area estimation appears easier
as a result of larger field sizes and more uniform agriculture
in regions such as the USSR.

As a result of Phase I experience, several problems were 
uncovered in the technology. The LACIE research, test, and eval- 
uation program produced several improved technology approaches 
which were or are being implemented for Phases II and III. 

2.2 AREA ESTIMATION

After correction of significant implementation problems in the
initial quasi-operational area estimation system, the resulting
wheat area estimation at harvest, based on its performance
quantified over the U.S. Great Plains, was deemed marginally
satisfactory in consideration of the 90/90 at-harvest criterion
for wheat production estimation. The area estimation system
shows a tendency to underestimate when compared to the SRS
estimates. The LACIE Great Plains area estimate was approximately
46,000,000 acres compared to the SRS estimate of approximately
51,000,000 acres, or about 10 percent below the SRS figure.
Analyses show this difference to be statistically
significant.



A significant contribution to this underestimate is believed
to be a sampling problem in North Dakota. An improved
allocation of samples on the basis of a better partitioning of
agricultural lands into more homogeneous strata is expected to
reduce any bias to a tolerable level. The use of full-frame
Landsat imagery is critical to defining adequate strata to
avoid such sampling error; this improved sample allocation is
currently planned to be tested in Phase III. The coefficient
of variation (c.v.) computed for the LACIE area estimator, when
projected to the U.S. national level, is about 5.0 percent,
slightly above the 4.25 percent required if production
estimates are to meet the 90/90 criterion. Because data loss due
to cloud cover and early implementation problems resulted
in a reduction in the number of LACIE sample segments used
(of 411 allocated, 380 were acquired and 272 were used), this
random error component can very likely be reduced to or below
the acceptable limit of 4.25 percent by the improvements
implemented and planned for Phases II and III.
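
A rough check on the statement above can be made by scaling the observed c.v. with the number of segments. The sketch assumes that the 5.0 percent figure corresponds to the 272 segments actually used and that the random error falls as one over the square root of the segment count, as it would for simple random sampling with equal segment variances. Both are simplifying assumptions of this illustration; the report's own error model, which accounts for stratification, is not reproduced here.

```python
from math import sqrt


def projected_cv(cv_observed, n_used, n_target):
    """Scale an observed c.v. to a different sample size.

    Assumes random error proportional to 1/sqrt(n); this ignores
    stratification and any bias, and is only a first-order illustration.
    """
    if n_used <= 0 or n_target <= 0:
        raise ValueError("segment counts must be positive")
    return cv_observed * sqrt(n_used / n_target)


CV_OBSERVED = 5.0   # percent, national-level area c.v. quoted in the text
N_USED = 272        # segments actually used
N_ACQUIRED = 380    # segments acquired
N_ALLOCATED = 411   # segments allocated

cv_380 = projected_cv(CV_OBSERVED, N_USED, N_ACQUIRED)
cv_411 = projected_cv(CV_OBSERVED, N_USED, N_ALLOCATED)
print(f"c.v. with all acquired segments:  {cv_380:.2f} percent")
print(f"c.v. with all allocated segments: {cv_411:.2f} percent")
```

Under these assumptions, using all acquired or all allocated segments would bring the c.v. to about 4.2 or 4.1 percent, respectively, which is consistent with the expectation that the 4.25 percent limit can be met.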

The results of this quasi-operational test for area were
further examined in the Phase I production feasibility test, where
the LACIE area estimates were combined with LACIE yield
estimates and the resulting production estimates were evaluated. This
production estimate satisfied the 90/90 criterion and indicated
the basic compatibility of the LACIE area and yield estimators.

Accuracy was also examined for selected sample segments and the
results indicate that the Landsat data and the classification
technology can estimate the small grains (i.e., wheat and closely
associated small grains) area within a sample segment
accurately and reliably enough to meet the LACIE goals. The
LACIE estimates in the segments agree well with independent
estimates from ground and aircraft observations. In North
Dakota, where 20 such sites were examined, no significant
difference was detected between the LACIE and ground
observations over the sample segments. The estimated c.v. of the
random classification error was "acceptably" small. These
analyses confirmed that bias introduced by various factors such as
Landsat spatial resolution, lack of spectral resolution,
classifier (analyst interpreter) bias and repeatability, etc., is
not excessive in terms of the required performance criterion.

Results of these tests did indicate a difficulty in
differentiating wheat from other closely related small grains.
However, wheat area estimates were obtained through the reduction
of the small grain area estimates in accordance with the
historic prevalence of these crops.
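
The reduction of a small grains estimate to a wheat estimate can be pictured as a simple ratio adjustment. The sketch below is a minimal illustration, assuming the historic prevalence is expressed as the ratio of historic wheat area to historic small grains area in the same region; the function name and all figures are hypothetical and do not come from LACIE data.

```python
def wheat_from_small_grains(small_grains_area,
                            historic_wheat_area,
                            historic_small_grains_area):
    """Scale a classified small grains area by the historic wheat share.

    Assumption of this sketch: the historic wheat-to-small-grains ratio
    for the region still holds in the current season.
    """
    if historic_small_grains_area <= 0:
        raise ValueError("historic small grains area must be positive")
    wheat_share = historic_wheat_area / historic_small_grains_area
    return small_grains_area * wheat_share


# Hypothetical segment: 1200 acres classified as small grains, in a region
# where wheat has historically been 750 of every 1000 small grains acres.
estimate = wheat_from_small_grains(1200.0, 750.0, 1000.0)
print(estimate)
```

The adjustment is only as good as the historic ratio: a season with unusual planting of barley, oats, or rye would shift the true share and bias the wheat estimate.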

There are some indications that in regions that have marginal
wheat production, small fields, or large amounts of confusion
crops, wheat identification may be more difficult than in
higher producing areas. LACIE plans to monitor these situations
closely during Phases II and III.

The several approaches taken to estimate sample error indicate
that for the U.S. Great Plains it is acceptably small given all
the allocated segments. Loss of acquisitions from cloud cover
was a problem in Phase I; however, tests conducted to date
indicate that error arising from this loss is probably random in
nature with no significant bias being introduced.

In North Dakota, a significant underestimate of the wheat area
was observed. Further analyses indicate the major problem is
with the sample placement as opposed to the classification
analysis. Indicated solutions are the allocation of additional
samples or improved stratification to reduce agricultural
area variability or both.

2.3 YIELD ESTIMATION 

The Phase I testing of the yield models² indicated that the
models can be expected to support the 90/90 criterion in
regions having characteristics similar (in geography and
agriculture) to the states in the yardstick region. It is recognized,
however, that the models may not perform as well in foreign
areas where historical record data are lacking or nonexistent.

In a test of the yield models over the years 1965 to 1975,
the c.v. of the yield estimates was on the order of 2 percent
at the national level, lower than the 4.25 percent required.

When combined with SRS area estimates in these same years, the
yield estimates would not satisfy the 90/90 criterion for
production given errors of equal magnitude in the area estimates.
However, it was noted that a source of the yield estimation
error was the form of the model, which resulted in unrealistically
high or low yield estimates for extremely high or low values
of the temperature or precipitation. An improved model has
been developed. Tests of this improved model indicate that it

estimates and meet the criterion. 


²These models will be incorporated into the LACIE
quasi-operational system in Phase II.


2.4 PRODUCTION ESTIMATION


When the LACIE area estimates and the LACIE yield estimates are
combined, the resulting production estimates satisfy the 90/90
criterion. In the Great Plains, the LACIE production estimate
was 8.8 percent below the SRS final estimate³ for the same
region. The c.v. of the LACIE production estimate was 5.3
percent at the Great Plains level and 4.2 percent when projected
to the national level. This is within the acceptable tolerance
of 6 percent for an unbiased estimation. Because the
difference between the SRS and LACIE estimate at the Great Plains
level is not significant (i.e., could likely be a random
fluctuation in this statistical quantity), the estimator can be
judged to satisfy the 90/90 criterion because the c.v. is less
than the 6 percent required. The largest regional problem
observed is once again in North Dakota, where production is
significantly underestimated because of the area estimation problem
discussed earlier.
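
The judgment that the Great Plains difference "could likely be a random fluctuation" can be illustrated by comparing the relative difference with the c.v. of the estimate. The sketch below assumes a normal approximation, treats the SRS figure as an error-free reference, and uses a two-sided 5 percent significance level. None of these choices is stated in the report, so the result is an illustration of the reasoning rather than a reproduction of the project's test.

```python
from statistics import NormalDist


def difference_is_significant(relative_difference, cv, level=0.05):
    """Two-sided normal test of a relative difference against a c.v.

    relative_difference and cv are fractions (e.g. -0.088 and 0.053).
    Assumptions of this sketch: normal errors, error-free reference
    estimate, and a significance level chosen by the reader.
    """
    z = abs(relative_difference) / cv
    critical = NormalDist().inv_cdf(1.0 - level / 2.0)
    return z > critical, z


# Figures quoted in the text: 8.8 percent below SRS, c.v. of 5.3 percent
significant, z = difference_is_significant(-0.088, 0.053)
print(f"z = {z:.2f}, significant at the 5 percent level: {significant}")
```

With these inputs the difference is about 1.7 standard deviations, short of the roughly 2 standard deviations needed at the assumed 5 percent level, which agrees with the report's conclusion that the difference is not significant.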


2.5 RATE OF ANALYSIS OF LANDSAT DATA 

The performance goal for the rate of analysis of Landsat data
was to be able to process between 15 and 20 segments per
working day and to complete a segment in a timely fashion such
that, in a true production operation, data from the satellite
would be analyzed and available for aggregation within 14
days of acquisition. By the end of Phase I, the volume of
data being analyzed was meeting Phase I goals (fig. 2-1). It
was determined that actual demonstration of a 14-day turnaround
was not necessary if it could be shown that this turnaround
could have been attained in a three-shift production operation.
During Phase I, there were a number of conditions typical of the
start of an operation which led to backlogging of data; hence,
the turnaround time was long in comparison with the goal.


³This is for the original yield model. When the revised model is
used, the corresponding difference is -5.6 percent (see Appendix A).


Figure 2-1.- Rate of processing Landsat data (segments processed per day).


When actual time in process was considered, the turnaround
time was 30 to 31 days. This should be compared to a target
time of 29 days (which corresponds to the 14-day goal
when adjustments are made for the number of shifts employed).

There are known areas where further improvement can be real- 
ized; these have been analyzed and improvements are being 
incorporated. 

2.6 SUMMARY AND OPEN ISSUES

There is considerable confidence from the Phase I results that
LACIE will meet its Phase II and III accuracy goals in the U.S.
Because some degradation in performance is to be expected when
expanding to some foreign areas, it is vital to reach or exceed
the accuracy goals in the yardstick area. In addition, the
following significant open issues exist in area estimation:

A. Technical problems are involved in distinguishing between wheat
and other small grains. Implicit in these problems is the
question of how important this capability is. This is
being addressed in Phase II. Two approaches are being
evaluated: (1) making an estimate for small grains as a class,
and (2) ratioing techniques utilizing historical data on
the prevalence of wheat to develop an estimate for wheat
from the small grains estimate.


B. Signature extension — technology available at the start of 
Phase I was inadequate and was removed from use. Substan- 
tial research efforts have been directed toward the various 
technical aspects of this problem during Phase I. Prom- 
ising approaches are being tested in Phase II for incor- 
poration in Phase III. 

C. Multitemporal analysis techniques — - technical problems pre- 
cluded the full use of these techniques early in Phase I; 
however, the problems were remedied, and successful use of 
multitemporal analysis was made in Phase I. 

D. Partitioning of the LACIE survey regions into areas of sim-
ilar agrophysical properties needs to be greatly improved.
It remains an open question as to how effectively data
such as soils maps, climatology, topographic data, and
Landsat full-frame imagery can be used to develop improved
partitions. Such partitions are important for improvements
in sampling, use of ancillary data, development of inter-
preter keys for Landsat data analysis, signature extension,
and yield modeling.
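
The ratioing approach named in item A amounts to scaling a small-grains area estimate by the historical share of wheat. The following is a minimal sketch of that arithmetic only; the function name, argument names, and figures are hypothetical and are not taken from LACIE data or software.

```python
def wheat_area_from_small_grains(small_grains_area,
                                 historical_wheat_area,
                                 historical_small_grains_area):
    """Apportion a small-grains area estimate to wheat by historical prevalence.

    All names are hypothetical. Areas may be in any consistent unit.
    """
    if historical_small_grains_area <= 0:
        raise ValueError("historical small-grains area must be positive")
    # Historical fraction of small-grains area that was wheat.
    wheat_fraction = historical_wheat_area / historical_small_grains_area
    return small_grains_area * wheat_fraction


# Illustrative numbers only: wheat historically occupied 800 of every 1000
# units of small grains, and the current small-grains estimate is 1200 units.
estimate = wheat_area_from_small_grains(1200.0, 800.0, 1000.0)
```

The quality of such an estimate depends on how stable the historical wheat fraction is in each locality, which is part of what the open issue asks.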

In the yield estimation activity, it is clear that improved
models are both desirable and possible. Approaches to relate
the models more closely to actual plant growth conditions are
underway, and refined models will be tested in Phase III.

In conclusion, Phase I was a successful step in LACIE, con-
sidering the complexity of the undertaking. No fundamental
changes were required in the experiment approach or schedule.

The technological problems and startup difficulties encountered
during Phase I were generally anticipated. It is considered




SECTION 3.0 

SUMMARY OF PHASE I TECHNICAL ACTIVITY 


OBJECTIVES 

A detailed statement of the experiment objectives is given in
the LACIE Project Plan. Briefly, the major goals to be accom-
plished by the end of Phase I were the following:

A. Select the most promising technology components to (1) iden-
tify wheat and estimate its area, (2) estimate yield, and (3)
estimate production.

B. Complete an overall experiment design (hardware, software, 
sample design) required to support all three phases. 

C. Implement that part of the analysis system required to esti-
mate wheat area over most of the hard red wheat region of the
United States (the Great Plains).

D. Develop procedures for handling and analyzing large quantities 
of data required in LACIE to meet the planned expansion into 
foreign areas.

E. Select and train personnel from the three participating agen- 
cies to implement, operate, and evaluate the LACIE system. 

F. Exercise the system in a quasi-operational manner and esti-
mate wheat area over the U.S. hard red wheat region and
evaluate¹ the results, both against established performance
criteria for at-harvest estimates and to determine how accu-
rately early season estimates can be made.

G. Test selected methods for estimating wheat yield and produc- 
tion prior to implementation of this capability for Phase II. 

H. Conduct parallel and supportive research, test, and evaluation 
to investigate improved approaches. 

I. Conduct initial analyses over selected foreign areas and areas
in the United States outside the Great Plains yardstick area
prior to expansion in Phase II.

J. Develop and implement evaluation plans for subsequent phases 
(II and III). 

K. Implement the additional components of the system required to 
support making quasi-operational yield and production esti- 
mates in Phase II. 


¹The "90/90" criterion was selected as a goal. This means 90 percent
accurate, at-harvest, by the end of the experiment (in comparison with
the true value) 90 percent of the time. As a practical matter, the best
available yardstick value is used for comparison. In the U.S., these
are Statistical Reporting Service (SRS) results; while no specific accuracy
goal exists for estimates prior to harvest, reports are issued on a regular
basis.

ACTIVITIES AND ACHIEVEMENTS


The activities and achievements described in this section repre-
sent the highlights of Phase I in the light of which the evalua-
tion in section 4.0 is made. The major achievements and results
to date are the following:

3.2.1 Area Estimation

A. An existing data system at the Goddard Space Flight Center
(GSFC) was modified with both software and hardware additions
to screen LACIE segments from the overall digital data
acquired by Landsat, conduct a temporal registration, format
the data, and transmit them to the Johnson Space Center (JSC).
Data acquisition and processing started as scheduled in
November 1974.

B. An existing data analysis system at JSC was modified to pro-
vide an interim LACIE system to analyze LACIE-formatted data
in the early part of Phase I (November 1974 through March
1975), and analysis was started as scheduled in November 1974.

C. The first data analysis system (LACIE 2 Automatic Data Proc-
essing (ADP) system) responsive to the LACIE requirements
for multispectral data classification was delivered in April
1975 on schedule. It was put into operation smoothly and
used for analysis of the bulk of the Phase I data.

D. Landsat 1 data over Kansas from the 1973-74 crop year were
edited retrospectively from archived data and transmitted to
JSC in LACIE format. These data were analyzed during the
period from November 1974 through January 1975, using the
interim LACIE data system and interim classification procedures.
The data sets were all for a single date; i.e., no multi-
temporal analysis was employed. Comparisons were made with
the USDA Statistical Reporting Service (SRS) state estimate
and with ground truth data acquired by ASCS on intensive test
sites. A relative difference from the SRS data of -3 per-
cent was noted with a coefficient of variation of 6 percent.

E. A sampling strategy was developed to acquire Landsat data for
the yardstick area (U.S. Great Plains) and for foreign explor-
atory areas. To provide data for a full crop year (1974-1975)
of winter wheat activity in the U.S. Great Plains, both Landsat
1 and Landsat 2 were required. Landsat 1 data were retrieved
from archives for analysis of fall acquisitions of winter wheat
segments. These data were analyzed using the interim LACIE
data system during the period January-March 1975.

F. Landsat 2 data acquisition was initiated shortly after launch
(January 1975) as crop development proceeded (i.e., as bio-
windows opened up).

G. The LACIE system for analysis of Landsat acquired data seg- 
ments operated at increasing throughput rates and, toward 
the end of Phase I, reached a rate of just over 15 segments 
per day. This compares favorably with the planned peak 
delivery rate in the range of 15 to 20 segments per day. 
Initially, the throughput rates for these data were limited
by a multitude of operational and logistic problems, most of
which were subsequently resolved.

H. Models for making seasonal adjustments to the crop calendars
for the U.S. Great Plains were implemented at the NOAA Center
for Climatic and Environmental Assessment (CCEA) and commenced
operation in April 1975 at CCEA with results transmitted to the
JSC.

I. Provisions for gathering meteorological data for use by clas-
sification analysts were implemented by NOAA. These data were
extracted from various ground sources such as the WMO network
and compiled by NOAA staff at JSC. This activity commenced
in April 1975. During Phase I, the utilization of NOAA
satellite imagery was also initiated to increase the informa-
tion flow to the classification analysts. This use was pri-
marily to explore, from the satellite imagery, the cause and
extent of anomalous situations.

J. An interim capability to aggregate segment results to provide
area estimates was implemented in April 1975. Area aggrega-
tions for the U.S. Great Plains were completed from April
through August 1975.

K. The initial analysis of a major portion of the Phase I data
for the U.S. Great Plains was essentially completed by late
July 1975. The results showed area estimates substantially
higher than the SRS results for most states. Results were
better for winter wheat states than for spring and mixed
spring and winter wheat states and, on a segment basis, better
for areas in which wheat is common than for areas in which it
is sparse.

L. The high estimates were unsatisfactory and prompted the
initiation of a close review of the area-estimation tech-
nology in early August. This review had broad participation
from the remote sensing community and confirmed that
incorrectly implemented procedures existed in the original
analysis approach. For example, because of the great impor-
tance of an early estimate, an attempt was made to arrive at
an estimate using fall data which showed little wheat emerged.
Areas of seed bed preparation were accordingly classified as
"potential wheat" and included in area aggregations. Since
seed bed preparations are also made for other reasons, this led
to a significant overestimate.

The identified problem areas led to a revision of the analysis 
procedures and to the initiation of an effort to reanalyze the 
U.S. Great Plains regions in order to evaluate the modified 
procedures.

The rework effort was completed in November 1975 and gave
area-estimation results which indicated that, at a national
level, estimates would be within 10 percent of the SRS results.
A significant discrepancy in North Dakota estimates was identified.

During Phase I, a total of 693 segments were studied. In
the U.S. Great Plains, an average of 2.3 Landsat images was
acquired for each segment, following the practice of utilizing
the first good acquisition in each of the four biowindows.
Cloud-cover conditions accounted for almost all the missed
data.

Area, yield, and production aggregations (Appendix A) were
conducted over the U.S. Great Plains (Texas, Oklahoma, Kansas,
Nebraska, Colorado, North Dakota, South Dakota, Montana, and
Minnesota). Results indicate the relative difference (bias)
of the LACIE North Dakota area estimate to be the major com-
ponent of the relative difference in the production estimate.


Classification tests were conducted on 2QJ exploratory seg- 
ments distributed among the other LACIE countries. 

Accuracy assessment activities were initiated in July 1975
and tests were conducted using segments where ground truth
data were available (from 29 Intensive Test Sites (ITS) and
from 28 "blind sites" where data were gathered after the
analysis). Some 340 special analyses were conducted to sup-
port the accuracy assessment. Basically these were special
tests to study the source and nature of classification errors.
In this accuracy assessment effort, state-level results were
studied to understand the effect of the component parts of
the error; for example, sample error versus classification
error and the interaction between classification and sampling
errors, particularly on the area aggregations (Appendix C).

The results from these tests indicate: 

1. In North Dakota, where the best estimates of classifica- 
tion error are available, the observed relative difference 
does not appear to result from classification error 
(Appendix C). Tests in Montana also tend to confirm 
adequacy of classification. 

2. In all states but Nebraska, classification error is about 
equal to the relatively small sampling error. 

3. In Nebraska, classification error is much larger, indi-
cating problems with confusion crops (Table C-V, Appen-
dix C).

4. The random component of sampling error appears to be
nominal in the four states examined. 


5. In North Dakota, where ground data for 20 segments were
compared to SRS county estimates, a difference was
observed which would account for the negative relative
difference in North Dakota (Table C-II, Appendix C).

6. An estimated random sample error component of 13 percent 
for North Dakota would not account for this relative 
difference (Appendix C).

7. In the U.S. Great Plains, SRS county estimates were 
substituted for LACIE segment estimates in an aggre- 
gation test to ascertain if any bias due to cloud 
cover was present. Overall no bias could be detected 
except for Colorado. 

S. Preliminary results from the area estimation accuracy assess-
ment indicate the major component in the relative difference
(bias) of the LACIE North Dakota area estimate to be sampling
error (bias) resulting largely from allocation of some sam-
ples to nonagricultural areas.
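
This section does not spell out the aggregation procedure referred to in item J. As a hedged sketch of how segment results might be rolled up to a regional figure, the code below assumes that each stratum's wheat area is the mean wheat proportion of its analyzed segments multiplied by the stratum's area. The data layout, names, and numbers are hypothetical.

```python
def aggregate_wheat_area(strata):
    """Sum stratum-level wheat areas from segment wheat proportions.

    `strata` maps a stratum name to a dict with:
      "area"        - total stratum area (any consistent unit)
      "proportions" - wheat proportions (0..1) of the analyzed segments
    Strata with no analyzed segments are skipped and reported separately,
    since this sketch does not assume any rule for filling them in.
    """
    total = 0.0
    missing = []
    for name, info in strata.items():
        proportions = info["proportions"]
        if not proportions:
            missing.append(name)
            continue
        mean_proportion = sum(proportions) / len(proportions)
        total += info["area"] * mean_proportion
    return total, missing


# Illustrative values only.
example = {
    "stratum_a": {"area": 1000.0, "proportions": [0.20, 0.30]},
    "stratum_b": {"area": 500.0, "proportions": [0.10]},
    "stratum_c": {"area": 800.0, "proportions": []},  # e.g. no usable acquisition
}
total_area, unsampled = aggregate_wheat_area(example)
```

Under this assumption, segments that fall in nonagricultural parts of a stratum pull the mean proportion down, which is one way a sampling bias of the kind described in item S could arise.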

3.2.2 Yield Estimation 

A. Models to project wheat yield for regions within the U.S.
Great Plains (the "yardstick area") were developed and
implemented at NOAA/CCEA. Test runs on a regular basis were
commenced in April 1975.

B. A capability to operate yield models at the NOAA/Page Facility
in Washington, D.C., was demonstrated in June 1975.

C. Tests of the U.S. Great Plains yield models for the 1974-75
crop year show a negligible relative difference (less than
1 percent) for the total region when compared to SRS results.
The coefficient of variation was also small (3.5 percent). If
this result were typical for all years, the yield models would
support the project accuracy goals.

D. Tests of the yield models for the U.S. Great Plains were con-
ducted retrospectively for the period from 1965 to 1974. The
results, when compared to SRS data, indicated that the models
would fall slightly short of meeting the 90/90 criterion. The
models were improved retrospectively and the tests were rerun.
It now appears that the yield estimates in the U.S. Great
Plains will support the accuracy goals (see Appendix B).
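
The 90/90 criterion against which items C and D are judged can be read as a probability statement: the estimate should fall within 10 percent of the true value at least 90 percent of the time. The sketch below evaluates that statement under an assumption that is not made in this report, namely that the relative error is normally distributed with mean equal to the bias and standard deviation equal to the coefficient of variation. Function names are hypothetical.

```python
from statistics import NormalDist


def prob_within_tolerance(bias, cv, tolerance=0.10):
    """Probability that the relative error lies within +/- tolerance.

    Assumes a normally distributed relative error with mean `bias` and
    standard deviation `cv`, both given as fractions (0.05 = 5 percent).
    """
    error = NormalDist(mu=bias, sigma=cv)
    return error.cdf(tolerance) - error.cdf(-tolerance)


def meets_90_90(bias, cv):
    """True if the estimate is within 10 percent at least 90 percent of the time."""
    return prob_within_tolerance(bias, cv) >= 0.90
```

Under these assumptions an unbiased estimate needs a coefficient of variation of roughly 6 percent or less, and any bias tightens that limit further.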

3.2.3 Production Estimation

A. The feasibility of estimating production was tested by com- 
bining LACIE area estimates and LACIE yield projections. When 
compared to SRS results, the LACIE at-harvest estimate for the 
region of the nine Great Plains states indicated a relative 
difference of approximately 8.8 percent with the original 
yield models and -5.6 percent with the revised models. The
coefficient of variation is 5.3 percent. 
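
Relative differences and coefficients of variation are quoted throughout this section without being defined here. The following sketch shows one common way to compute them, assuming the relative difference is normalized by the reference (SRS) value and the coefficient of variation is the sample standard deviation divided by the mean. The report may normalize differently, and the numbers are illustrative only.

```python
from statistics import mean, stdev


def relative_difference_pct(estimate, reference):
    """Percent difference of an estimate from a reference value.

    Assumption: normalized by the reference value.
    """
    return 100.0 * (estimate - reference) / reference


def coefficient_of_variation_pct(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Illustrative numbers only, not LACIE or SRS data.
rd = relative_difference_pct(94.4, 100.0)
cv = coefficient_of_variation_pct([95.0, 100.0, 105.0])
```

A negative relative difference under this convention means the estimate is below the reference.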

3.3 PROBLEMS 

There were technical and nontechnical problems which arose during
Phase I. Those described in this section are the major ones
which were encountered. Some have been resolved and others
remain issues. All open items are being pursued as part of Phase
II activity.

This section (3.3) is intended to give brief descriptions in one
location of the major problems encountered. These descriptions
should be read in conjunction with Section 4.0 and Appendices A,
B, and C to gain a valid assessment of the significance of these
problems.

A. The interpretation of the Landsat data themselves for training 
the classifier was generally successful except that it was 
consistently difficult to discriminate between wheat and other 
small grains (oats, barley, rye). This is still an open issue. 

However, two approaches are being pursued. One is to make an
estimate for small grains as a class. This is a useful esti- 
mate in and of itself. A second approach is to apportion the 
total area estimated to be in small grains into wheat and other 
according to the historic prevalence of wheat in each locality. 
This "ratioing" technique is expected to give a valid esti- 
mate for wheat and initiate the construction of a historical 
data base of consistent estimates utilizing Landsat input. 

B. A basic element intended in the LACIE classification approach
was the use of multitemporal analysis; i.e., using the data
from multiple Landsat passes in the analysis. The initial
implementations were unsuccessful but were subsequently
corrected, and limited use was made of selected multitemporal
data sets in the rework of the U.S. Great Plains. Use of
multitemporal analysis will continue during Phase II.

C. Another major element in the LACIE technical approach is the
use of signature extension to amplify the training knowledge
from one or more segments to one or more neighboring segments
of similar characteristics. An initial implementation was 
utilized during the first several months of Phase I. The 
results, however, were not satisfactory, and signature extension 
appeared to work in only about 20 percent of the cases. The 
LACIE Research, Test, and Evaluation (RT&E) activity has 
recently produced an improved signature extension technology, 
and activity is planned for Phase II to advance and test sig- 
nature extension capabilities so that this technology can be 
utilized in Phase III. 

D. Historical agricultural data (growth stages, yield, etc.) 
were often not available in consistent format or at the right 
level of detail for full utilization. This hampered the 
development of yield models, adjustable crop calendars, and 
data packages to aid in classification of Landsat data. 

Adequate data to support activities in the U.S. Great Plains,
the yardstick region where analytical techniques are 
calibrated, are expected for Phase II. All the historical 
data that may be desired may not be available in other parts 
of the world. This is being taken into consideration, and 
analysis techniques are being structured accordingly. 

E. Crop calendars incorporating seasonal adjustments for winter
wheat in the U.S. Great Plains were not available early in
Phase I, and data for the first (fall biowindow) acquisition
were therefore timed according to average calendars. The
actual situation for winter wheat in the fall of 1974 was such
that planting and wheat growth were substantially delayed.
Thus, data gathered at a time when wheat would normally have
emerged showed only bare soil. This is not expected to be a
problem in Phase II since data from all Landsat passes are now
being acquired and examined. This allows a determination to
be made as to whether or not a particular extent of emergence
has occurred and only when the crop is sufficiently advanced
will analysis be continued.

F. Because an estimate of wheat production early in the crop 
year is considered especially valuable, it has been a project 
concern to produce estimates as early as possible. During 
Phase I, an attempt was made to arrive at an area estimate 
using fall data which (as stated in paragraph E above) showed 
little wheat emerged. The approach was to classify areas of 
seed bed preparation or bare soil as "potential wheat." How-
ever, fall plowing and seed bed preparation are conducted in
many areas for purposes other than planting wheat, and thus
LACIE gave a higher area estimate (by a factor of 2 or more) 
than SRS data. 

G. The high area estimates noted early in the season persisted 
through the crop year as a result of retaining a substantial 
amount of the early biostage segments for which "potential 
wheat" was estimated. These estimates were used for segments 
which had no later acquisitions. The estimates were some 40
percent high for the U.S. Great Plains. A number of possibil-
ities to improve the estimates were determined in detail by
participants during and after the Area Estimation Technology
Review conducted in August 1975.

It is felt that the major causes of the high estimates in
addition to the example described in section 3.3 (F) were
(1) cases in which wheat could not be separated from small
grains and other crops and (2) cases in which an ambiguous
classification would be arrived at, such as results for three
overlapping classes - "winter wheat," "spring wheat," and
"wheat." This situation has been resolved by a consistent 
and mutually exclusive set of class and subclass definitions 
plus a procedure for apportioning gross categories like "small 
grains" among the specific classes allowable. 

H. The operations for analysis of Landsat data during Phase I
were characterized by a number of "start up" situations pecu-
liar to the particular implementation of the experiment and
by the high level of rework required as procedures were
refined. This led to a median processing time of 40 days from
acquisition until completion of the analysis. It is deduced
that a 14-day turnaround could be attained in a three-shift
production operation.

I. As a result of the general magnitude of the LACIE task and, 
in part, because of the rescoping to meet budget, an auto- 
mated status and tracking system was never implemented dur-
ing Phase I, and tracking was done manually. A good picture
of just where segment processing stood was not always avail- 
able, nor could progress be statused by geographic location, 
biowindow, etc. An improved status and tracking system is 
now available, and the problems experienced are in no way 
basic to the LACIE approach. 

J. Certain problems were found in the sampling. One is in the
incorrect placement of samples in nonagricultural areas due
to lack of proper delineation of such regions (see 3.2.1 (R)).
Another problem concerns the assumption that counties are rel-
atively homogeneous. Actual experience has not supported this.
The effects have yet to be verified and quantified, but they
may require that a new set of segments be defined for Phase
III in selected areas. Landsat data coupled with topography,
soil families, and climatic data provide the basis for the
delineation of areas to be sampled, and hence any improve-
ments of this type deemed to be desirable will be carried
out for both foreign and domestic regions.


4.1 ATTAINMENT OF OBJECTIVES


With respect to the major objectives set forth in Section 3.1,
the interim evaluation is described in the following paragraphs.

General 

An evaluation was made of a number of general items not tied to
any one aspect of the experiment. In particular, the following 
should be noted. 

A. The data acquisition and analysis system that was planned
(including various elements at different locations) was
developed in a timely manner and generally performed well.
Further, it was upgraded in significant ways during the course
of Phase I. The mechanical aspects of the design were satis-
factory in being able to carry out all the planned functions
and produce the required products. There were three signifi-
cant shortcomings in the overall LACIE system. The most ser-
ious was the relatively long time it took to get analysis pro-
ducts (film, computer runs, etc.) back to the analysts as a
segment moved from one stage of processing to another. A
second problem was the absence of an automated status and
tracking system, so a manual workaround was required. The
third was that only a relatively simple aggregation system
was available and this also required cumbersome workarounds,
such as building a separate data base for each aggregation.
All three of these shortcomings are being corrected for
Phase II.




B. The LACIE system included personnel and procedures as well 
as software and hardware. Staff members were hired and 
trained to support analysis activity as required. Procedures 
were developed for use in analysis of LACIE data, but docu- 
mentation was not as complete as planned for Phase II. Short- 
comings were identified and corrected. 

C. Modifications to the technology were made at many points in 
the LACIE system throughout Phase I. The system, including 
both physical and human elements, has proven to be adaptable 
to change.

D. The location of the 5x6 nautical mile (n.mi.) segments used in
the LACIE analysis of acreage is typically within ±1 n.mi.
of the target location compared with a specified ±3 n.mi.
This is for the first acquisition of data for that segment.
It has been possible to register subsequent acquisitions to
the first with an accuracy of about 80 meters.

4.1.2 Area Estimation

A. Two test results from Phase I pertain to area estimation 
capability: 

1. A very limited early investigation in Kansas, a winter
wheat region, for 1973-1974 (para. 3.2.1 (D)) would
indicate, if results were projected to the national level,
that the 90/90 performance goal for production would be
met.¹


¹This assumes an equal distribution of error between area and
yield and that the bias is within +5 percent.

2. The major effort over the U.S. Great Plains indicates
the following:


a. The area estimation results are marginally accept- 
able in supporting the 90/90 production estimation 
criterion. The accuracy for winter wheat in the 
southern U.S. Great Plains appears better than for 
spring and mixed spring and winter wheat regions in 
the northern Great Plains. 

b. A study of state-by-state variations indicates that a
major source of error in the estimate of spring small
grain area in the spring wheat states is sample
error in North Dakota. This error is thought to
result from heterogeneities in agriculture within the
LACIE sample strata (counties). In addition, spring
wheat cannot be adequately distinguished from spring
small grains, although spring small grains can be
distinguished quite adequately from other crops. For
winter wheat, the major source of error appears to be
classification error in marginal areas such as
Nebraska, where confusion crops such as alfalfa are in
abundance. Moderately large but tolerable sample
error is also noted in the winter wheat states other
than Kansas. The prognosis at the national level is
that, given resolution of the problem causing the
underestimation in North Dakota, LACIE area estimates
should support the 90/90 criterion for accuracy of
the production estimates.

c. A study of intensive test site data, where ground truth
was available, gave a further indication that the classi-
fication procedure for developing area estimates from
Landsat data was performing well. A test on 9 segments for
which the proportion of wheat (or small grains) estimated
by the LACIE classification procedure could be compared
with the proportion from ground data indicates a rela-
tive difference and a coefficient of variation well within
the tolerable limits at a segment level to support the
90/90 criterion.

d. Two consistency tests show that the area estimation pro- 
cedures are repeatable with respect to analyst performance. 
One test with 14 analysts each studying two sites showed
no statistically significant difference with respect to
analysts or to the biowindow within which the data were
acquired. Another test with four analyst teams each
studying nine sites showed no significant difference 
among the teams. Further, this test involved a rework of 
sites which had been processed originally through the 

flow. No significant change was noted between 
the original and the reworked results. 

B. Classification tests were conducted on exploratory segments
over all seven LACIE countries outside the U.S. Of the
exploratory segment acquisitions received at JSC, approx-
imately one-half were classified and wheat proportions gener-
ated by CAMS. This was the same proportion experienced for
all LACIE acquisitions and reflects the processing of the
exploratory segments by CAMS with the normal Phase I procedures.


Examination of exploratory imagery obtained in Phase I showed
that many of the segments were located in nonagricultural
areas. This problem was referred to in paragraph 3.3 J.
Agricultural-nonagricultural redefinition will be repeated
for the areas in question during Phase II using Landsat
imagery.

In the case of the USSR, Argentina, and Canada, the explora- 
tory segments were considered to be representative of the 
countries' agriculture. The analysts' qualitative evalua-
tion of classification tests is that the USSR is likely to
be straightforward with large fields and homogeneous signa- 
tures. Canada is more difficult than the U.S. because of 
extensive strip/fallow cropping and a greater variety of 
competition crops. 

India and those areas of China with small fields will be dif- 
ficult, and it is not yet known what accuracies can be 
expected. For China, a new selection of exploratory segments 
in one province has been made for Phase II in hopes of gain- 
ing better experience by concentrating in one agricultural 
area. 

Little experience was obtained in analysis of Landsat data 
acquired over Brazil, Argentina, or Canada because relatively 
few acquisitions were obtained. A problem experienced with 
processing of exploratory segments was that of inadequate or 
incomplete ancillary data (see paragraph 3.3 D).


C. The adjustable crop calendar model works reasonably well
(given good starting dates) and is almost always a significant
improvement over the average crop calendar.

D. The area-estimation technology was tested throughout Phase I.
Procedural changes were made and further tests conducted as
the experiment proceeded. It now appears that the area-
estimation technology will be adequate for LACIE.

E. Area-estimation accuracy is suffering, although not to an
intolerable degree, from the loss of data to cloud cover.
A preliminary indication is that excellent classification
results can be obtained with data from the first and fourth
biowindows plus either the second or third. Thus, an aver-
age data return of 2.3 acquisitions per segment is on the
lean side. Steps to improve this situation are being
explored.

F. See Appendix C for a more detailed treatment of accuracies 
obtained in area estimation. 
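
The coverage rule suggested in item E (first and fourth biowindows plus either the second or third) is easy to state as a check on each segment's acquisitions. The sketch below assumes biowindows are numbered 1 to 4 in the order crop establishment, green, heading, mature; the numbering and the function name are assumptions made for illustration.

```python
def has_preferred_coverage(biowindows_acquired):
    """True if acquisitions cover biowindows 1 and 4 plus either 2 or 3.

    `biowindows_acquired` is any iterable of biowindow numbers (1..4)
    for which a usable acquisition exists; duplicates are ignored.
    """
    got = set(biowindows_acquired)
    return {1, 4} <= got and bool(got & {2, 3})


# Illustrative segments only.
coverage = {
    "segment_x": has_preferred_coverage([1, 2, 4]),
    "segment_y": has_preferred_coverage([1, 4]),
    "segment_z": has_preferred_coverage([1, 2, 3]),
}
```

At least three acquisitions are needed to satisfy the rule, which is why an average return of 2.3 per segment is described as lean.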

4.1.3 Yield Estimation 

A. The 1974-75 crop-year results in the U.S. Great Plains would
indicate, if typical, that the yield model estimates are
sufficiently accurate to meet the 90/90 production criterion.²


²Based on an estimate of the standard deviation projected to the
national level and on the assumption that the production bias is within
+5 percent.

B. The 10-year test indicates that the initial models missed
the 90/90 criterion by a narrow margin. However, there were
indications of where improvements were necessary, and some rel-
atively straightforward measures were taken. The capability
to project yield was improved and the improvement tested in
the U.S. Great Plains. The evaluation is that this component 
of the technology will support LACIE goals for future phases 
of the experiment. However, further improvements to selected 
models are planned. 

C. See Appendix B for a more detailed treatment of yield esti- 
mation accuracies. 

4.1.4 Production Estimation

A. The capability of making production estimates at two levels
of aggregation, the Crop Reporting District (CRD) and the
state, was demonstrated, and this capability should, with
some minor improvements, support the remainder of the experi-
ment. The area estimation and yield estimation accuracies
can be improved to meet these production accuracy goals.
The combination of the area and yield estimates to a pro-
duction estimate will introduce no further error.

B. See Appendix A for a more detailed treatment of production 
estimation feasibility studies conducted. 
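
Production is the product of area and yield, so the production error is governed by the two component errors. The sketch below illustrates that arithmetic under an assumption not stated in this report: that area and yield errors are independent, so that their relative variances add to first order. It also shows the per-component allowance when the error is shared equally between area and yield. All names are hypothetical.

```python
import math


def production(area, yield_per_unit_area):
    """Production as area times yield (units must be consistent)."""
    return area * yield_per_unit_area


def production_cv(area_cv, yield_cv):
    """First-order CV of a product of two independent estimates."""
    return math.sqrt(area_cv ** 2 + yield_cv ** 2)


def equal_split_cv(production_cv_target):
    """CV allowed for each of area and yield if the error is shared equally."""
    return production_cv_target / math.sqrt(2)


# Illustrative values only.
combined = production_cv(0.03, 0.04)
per_component = equal_split_cv(0.06)
```

Under this approximation, a production coefficient of variation of 6 percent leaves a little over 4 percent each for area and yield.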

4.2 TECHNOLOGY SUMMARY
General 

A major goal of LACIE in general and of Phase I in particular was
to validate, where possible, key elements of the technology for
crop inventory, and to identify areas in which the technology
needed strengthening. To a large extent, both aims were accomp-
lished.


4.2.2 Technology Validation

Major elements of the technology that are considered to be vali- 
dated are a capability to: 

A. Search Landsat data, edit a desired area, and conduct a tem- 
poral registration to 1 pixel. 

B. Extract a preselected sample segment to within 1 n.mi. of its 
actual position. 

C. Automatically screen data that exhibit much cloud cover with- 
out discarding good data. 

D. Collect, periodically, multistage "ground truth data" within
the U.S. 

^ ^ ^ E Provide large amounts of high-quality film products . 

F. Employ very large scale mass storage and tape storage facility 
for electronic data and track updates, purges, and related 
activities. 

G. Maintain files, logs, and distribution systems for manual
control of physical data products.

H. Accurately select (locally), from Landsat data alone, training
fields for use in computer classification of multispectral data
(considered partially validated in view of the difficulty in
separating wheat from other small grains).

I. Provide adequate weather data to interpreters/analysts to
support identification of wheat.

J. Use single or multitemporal data sets for wheat classification
by maximum-likelihood techniques.

K. Automatically process small fields of the type most common
in North America (strip/fallow).

L. Acquire, process, and transmit necessary meteorological data 
from a worldwide network. 

M. Develop and operate mathematical models to estimate the stage 
of crop development and to project yield.

N. Status and track a large amount of remote sensing and
meteorological data and a wide array of internal and output data
products.

Technical and Procedural Issues 


Major elements of the technology and the procedures that require 
strengthening are the following: 

A. Accuracy of area estimates 

B. Accuracy of yield estimates 

C. Ability to acquire and analyze data in a timely manner 


D. Ability to partition study regions and to extend signatures 
from one segment to another segment within the partition 



E. Incorporation of evapotranspiration and other weather-related
variables within actual crop calendar periods (adjusted for
current year weather) into yield models

F. Applicability of adjustable crop calendar for wheat to various 
confusion crops 

G. Accuracy and detailed local applicability of crop calendar 
starter models 

H. Utilization of meteorological satellite data in crop calendar 
and yield model areas 

I. Capability to provide effective quality control on data and 
analysis procedures 

J. Capability to provide a specific scheduling of LACIE segments
for processing

SUPPORTING RESEARCH PROGRAM

An important part of LACIE is a supporting research and test
program. Substantial progress was made in a number of areas,
some of which will contribute to LACIE during the life of the
experiment. The most noteworthy items are the following:

A. Alternate yield-modeling approaches were developed and tested
under contract. Their main advantage is a spatially more detailed
meteorological input, permitting expression of a more direct
cause-and-effect relationship, a feature that will be incorporated
into later LACIE models.

B. An improved crop calendar and starter model for winter wheat 
was developed at Kansas State University. This will be 
incorporated into LACIE. 

C. A field measurements program has been conducted at two "super
sites" during Phase I. This program will provide, in addition to a
field data set for LACIE use, a data set of enduring value for
remote sensing research. Landsat, aircraft, helicopter
spectrometer, and field spectrometer data were gathered as nearly
simultaneously as possible. A third site has recently been added
to provide a wider range of agricultural conditions, and the
locations now under study are Finney County, Kansas (winter
wheat); Williams County, North Dakota (spring wheat); and Hand
County, South Dakota (both winter and spring wheat).

D. An error model was developed under contract. This model is 
presently in use and will permit the simulation of the 
accuracy effects of changes to various input parameters. 

E. Signature extension research was carried on at the Laboratory
for Applications of Remote Sensing (LARS), Purdue University;
the Environmental Research Institute of Michigan (ERIM), Ann
Arbor, Michigan; Texas A&M University; University of Houston;
University of California at Berkeley; and Kansas State
University. This work continues with activity both in the
project and in the research community. Although signature
extension is still a major area of technical risk, it is believed
that LACIE has significantly focused and advanced the development
of this area of technology, and there is reasonable hope that a
viable capability will exist by the end of the experiment.

In summary, substantial progress has been made in validating
a crop-inventory system based on multispectral remote sensing
and mathematical yield models. The activity in LACIE has
provided the best demonstration to date that wheat can be
identified and the area measured by satellite remote sensing.

Coefficient of variation

Subscripts: P, A, Y, C, S

Segment level coefficient of variation

APPENDIX A

SUMMARY OF PRODUCTION ACCURACY ASSESSMENT


A1 ACCURACY ASSESSMENT METHODOLOGY

Phase I goals called for wheat area estimates in a quasi-
operational mode and the yield and production estimates as part
of the research, test, and evaluation (RT&E) program. This appendix
discusses the results of the production feasibility study
conducted on the Phase I LACIE estimates in an RT&E mode. These
results are examined in terms of the LACIE accuracy goal of
estimating wheat production at-harvest¹ for a country to within
10 percent of its true value in 9 of 10 years, referred to as
the 90/90 criterion.

In principle, the evaluation of the LACIE production estimates
against this criterion would require a comparison of the LACIE
estimates to the "actual" production for a period of several
years. This approach is obviously impractical to implement until
several years of operational experience are obtained.

In practice, LACIE must estimate its performance parameters from
data analysis experience acquired to date and draw inferences as
to the performance of the technology if it were to be operated
for a span of several years. These inferences can be viewed
with confidence as long as the conditions under which they are
likely to be valid are borne in mind. The ability to identify
wheat, measure its areal extent, and estimate wheat yields
is dependent to some degree on the extant agricultural and
meteorological conditions; thus, the performance will vary with
these factors, which will change from year to year. For example,
the estimation of area and yield in unusual or episode years with
large regions of severe drought or winterkill will certainly be
more difficult than in normal years in which the response of the
crop to its environment is better documented and understood.

¹It should be understood that LACIE does make production estimates
throughout the growing season, but the valid basis for comparison is the
at-harvest estimate.

In Phase I, the performance of the LACIE production, yield, and
area estimators was evaluated and the magnitudes of their
component errors estimated in the manner described generally below.
These analyses were conducted through quantitative statistical
comparisons to ground observations of wheat area and condition,
historic data published by national reporting services, and
current year area, yield, and production estimates published by
the Statistical Reporting Service (SRS) of the USDA. It is
these latter data which are used as the "actual" or reference
standard data at the state and national levels. While these SRS
estimates are not exact, they are believed to be sufficiently
accurate at the Great Plains level to serve as a reference
standard for LACIE. At state levels and below, a significant
part of the difference between LACIE and SRS estimates can be
attributed to errors in the SRS figures.

To determine if the LACIE estimators of production were able to
satisfy the 90/90 criterion discussed above, the performance
data were used to examine the contention that "The LACIE
production estimate for the U.S. is, with a probability of at least
90 percent, to within ±10 percent of the 'actual' production
estimate for the U.S."

If, as a result of these analyses, the contention can be
established as false, then the implemented technology is examined
for potential improvements to meet the 90/90 criterion. The
magnitudes of the system component errors are examined to determine
where the emphasis on technology modifications should be focused.
If the performance analysis provides no basis on which to reject
this contention, then one has a reasonable expectation that
in 9 of 10 years, with a range of agricultural and meteorological
conditions similar to the test data, the LACIE production
estimates would be within ±10 percent of the SRS figure at the
national level.

Reasonable expectation is the chosen terminology because, at this
early date, it is not possible to determine directly from the
available data the manner in which the LACIE production estimates
would be distributed about the SRS national production estimate.
To determine this distribution, the LACIE experiment would have to
be replicated, and such replication would require excessive
resources. In lieu of a knowledge of this distribution, the 90/90
criterion is evaluated in terms of the estimated variance and
bias of the production estimator, under the assumption that the
estimator would produce normally distributed estimates in
replicated trials. Under this assumption of normality, the
probability that the LACIE national estimator will produce an
estimate within ±10 percent of the SRS national estimate can be
related to the computed variance and bias of the LACIE estimator.

Since the production estimator is the sum, over the region under
study, of products of area estimates and yield estimates obtained
for the coincident yield and area strata (e.g., U.S. Crop
Reporting Districts (CRD)), its statistical properties can be
derived from a knowledge of the statistical properties of the area
and yield estimators. In Phase I, it was assumed that the
errors of the yield and area estimators were uncorrelated with
each other. This approximation can be modified if experience
reveals that there is indeed some correlation. Under this
assumption, the coefficient of variation (c.v.) of the production
estimator (estimator standard deviation divided by the expected
value of the estimate) is given by

    (c.v.p)^2 = (c.v.a)^2 + (c.v.y)^2 + (c.v.a × c.v.y)^2

The c.v. of the area and yield estimators (c.v.a and c.v.y,
respectively) are computed by comparison to SRS or agricultural
census data at various geographic levels using techniques to be
discussed in Appendices B and C. Since the 90/90 criterion is for
the national level and the LACIE estimates are for the Great
Plains, the c.v. computed at the Great Plains level must be
projected to the national level. The projection used will be valid
if the estimator performance as determined in the Great Plains is
representative of the remainder of the U.S. wheat region. It can
be shown that if the variances of the production estimator in
strata exterior to the Great Plains are equal to or less than the
strata variances encountered in the Great Plains, then c.v.p for
the national estimate should decrease, at the least, in proportion
to the square root of the production increase from the Great
Plains to the national level.

Given the normality assumption, it can be shown that the 90/90
criterion can be satisfied for a range of c.v.p and bias. In
case the estimator is unbiased, c.v.p can be as large as 6 percent
and satisfy the 90/90 criterion. As the magnitude of the
estimator bias increases, there must be a corresponding decrease in
c.v.p to retain the 90/90 standard. For example, if the bias
is 5 percent, then the c.v.p must be 4 percent or less.
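The relation between bias, c.v.p, and the 90/90 criterion under the normality assumption can be checked numerically. The following is a minimal sketch, not LACIE software: the function names are hypothetical, and it assumes that bias and c.v.p are both expressed in percent of the reference value and that the tolerance is ±10 percent.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_within_tolerance(bias_pct, cv_pct, tol_pct=10.0):
    """Probability that a normally distributed relative error with mean
    bias_pct and standard deviation cv_pct lies within +/- tol_pct."""
    upper = (tol_pct - bias_pct) / cv_pct
    lower = (-tol_pct - bias_pct) / cv_pct
    return normal_cdf(upper) - normal_cdf(lower)

# Unbiased estimator with a c.v. of 6 percent.
p_unbiased = prob_within_tolerance(0.0, 6.0)
# Estimator with a 5 percent bias and a c.v. of 4 percent.
p_biased = prob_within_tolerance(5.0, 4.0)
print(round(p_unbiased, 3), round(p_biased, 3))
```

Under these assumptions the unbiased case comes out just above 0.90 and the 5 percent bias case just under 0.90, which is consistent with the limits quoted above being rounded values.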



The bias of an estimator with respect to a particular data set
is defined to be the average value of the differences between
the estimates and the "true" value as determined from a set of
replicated trials using the estimator. Thus, to compute directly
the bias of the LACIE estimator, a multispectral and
meteorological data set would need to be repeatedly analyzed to
obtain replicated estimates of production. The average difference
between the reference value and the set of estimates so obtained
would provide an estimate of the bias attributable to the
estimator.

Such an experiment on a large scale is obviously prohibitive; 
however, tests can be conducted to determine the probability 
that the estimator is biased as discussed below. 

Since the production estimator is known to have a random error
component with magnitude c.v.p, replication of this experiment
would produce observed relative differences with a distribution
of values; most of these values would lie in an interval bounded
by the average relative difference ±c.v.p. For example, 90
percent of them should be contained in the interval bounded by the
average relative difference ±1.645 c.v.p. Thus, if it is assumed
that the LACIE production estimator is unbiased, i.e., the
average relative difference is zero, 90 percent of the observed
relative differences should be between ±1.645 c.v.p. Therefore,
for a particular value of the relative difference (given an
unbiased estimator), there is less than a 10 percent chance that
a particular relative difference would lie outside the interval
±1.645 c.v.p.

Thus, in LACIE, the c.v. of the production estimator is computed
from the data as previously described. If the relative difference
between the LACIE production estimate and the reference standard
estimate is between ±1.645 c.v.p, the data are considered
insufficient evidence to establish the existence of a bias. If the
observed c.v.p is 6 percent or less, then there is a reasonable
expectation that the LACIE production estimator will satisfy the
90/90 criterion. As c.v.p becomes smaller than 6 percent, it is
known that some degree of bias can be tolerated and the confidence
that the LACIE estimator will satisfy the 90/90 criterion is
increased.
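The two checks just described (no evidence of bias, and c.v.p of 6 percent or less) can be written as a short screening routine. This is an illustrative sketch only; the function name and argument names are hypothetical, and all quantities are assumed to be in percent. The example inputs are taken from the Great Plains entries of Table A-I.

```python
def screen_estimator(rel_diff_pct, cv_pct, z=1.645, cv_limit_pct=6.0):
    """Return (no_bias_evidence, cv_ok) for a production estimate.

    no_bias_evidence: the relative difference lies inside +/- z * c.v.
    cv_ok:            the c.v. does not exceed the 90/90 limit.
    """
    no_bias_evidence = abs(rel_diff_pct) <= z * cv_pct
    cv_ok = cv_pct <= cv_limit_pct
    return no_bias_evidence, cv_ok

# Flagged yield model with Group II segments: R.D. = -5.62, C.V. = 5.87.
flagged = screen_estimator(-5.62, 5.87)
# Original state-level model, Group II treated as Group III:
# R.D. = -12.73, C.V. = 9.13.
alternate = screen_estimator(-12.73, 9.13)
print(flagged, alternate)
```

In this sketch the first case passes both checks, while the second shows no detectable bias but fails the c.v. limit, since a large c.v. widens the interval within which a bias cannot be detected.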

The performance of the LACIE estimator is also examined at
geographic sublevels within the Great Plains to determine the
dependence of the performance parameters on geographic factors
such as cropping practice (field size, rotation systems, etc.)
and climatology. Since the LACIE estimator is designed for most
accurate estimation at the national level, the estimation
accuracies at the state levels are considerably poorer than at the
larger levels; however, examination of the relative size of the
errors from one locale to another is extremely useful in
detecting problem conditions, i.e., agricultural or climatic
conditions which strongly affect the LACIE estimation performance.

PRODUCTION ESTIMATION FEASIBILITY TESTS 

In Phase I, several alternative approaches to production estima- 
tion were evaluated. Estimates from three yield estimators as 
well as estimates from two area estimators were combined and 
evaluated for production estimation. In addition, one yield 
estimator was utilized to produce yield estimates at both the 
crop reporting district and the state level to evaluate the effect 
on production estimation accuracy of combining yield with area at 
these two levels. 


The two area estimators utilized differed only insofar as the
inclusion or exclusion of "Group II segments" in LACIE area
estimates. These are segments within Group II counties in which
wheat is so sparse that one segment is used to estimate the area
within several such counties. The contribution to accuracy of
the Group II estimation approach was in question since area is
more difficult to estimate in segments with small percentages
of wheat. In one estimator the LACIE area estimates for these
segments were used as originally planned in LACIE. In the other
estimator the Group II segment estimates were not used and these
counties were treated as Group III counties, where area is
estimated using ratios of historic to current area estimates
between these counties and Group I counties. This test permitted
the Group II estimation concept to be evaluated for reduction, if
any, in overall area and production estimation error.

The yield estimators, discussed in Appendix B, were all variants
of a basic regression approach utilizing monthly average
temperature and precipitation as the prime weather variables, with
a trend term to account for other effects on yield. One
alternative, referred to herein as the "flagged" model, utilized the
basic regression model with quantitative upper and lower bounds
on the values which the weather variables were not allowed to
exceed. This approach, a purely heuristic one, was taken to
eliminate unrealistically high or low values of yield estimates
obtained with the original approach in some years when unusual
amounts of precipitation were known to occur. A final variant
utilized the "flagged" model and an "improved fit" for the
trend term, using the yield data prior to the 1974-75 crop year.
This test, referred to as the "trend-adjusted" test, was conducted
to determine the errors in production estimation being introduced
by errors in the determination of the trend term.

Comparisons of the LACIE and SRS production estimates are
at-harvest. These at-harvest estimates are made after wheat has
been observed by Landsat through maturity and after at-harvest
measurements of the weather variables have been utilized in the
yield models.

RESULTS 


Utilizing the yield estimates obtained by the basic yield
regression approach and the area estimates from the planned LACIE
area estimation approach, these estimates were combined at the crop
reporting district level and the resulting production estimates
summed to the Great Plains level. This produced an at-harvest
production estimate for the U.S. Great Plains of 1,253,300 bushels
compared to 1,363,400 bushels as estimated by the SRS. The
absolute difference between these two estimates is about 110,000
bushels, indicating a relative difference of -8.79 percent from
the LACIE estimate. The standard deviation computed for the
LACIE estimate was 66,500 bushels or 5.31 percent of the LACIE
estimate. This latter percentage, the estimated coefficient of
variation (c.v.) in the LACIE estimate at the Great Plains level,
is projected to decrease to 4.24 percent at the national level,
given the conditions discussed in the previous Section, A2.
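The c.v. at the Great Plains level and its projection to the national level can be reproduced with a few lines of arithmetic. The sketch below is illustrative only. The share of national production attributed to the Great Plains is not stated in this appendix; the value 0.638 used here is an assumption, backed out from the reported decrease from the Great Plains c.v. to the national figure of 4.24 percent, and the function names are hypothetical.

```python
import math

def coefficient_of_variation_pct(std_dev, estimate):
    """c.v. in percent: standard deviation divided by the estimate."""
    return 100.0 * std_dev / estimate

def project_cv_pct(cv_pct, production_share):
    """Project a regional c.v. to the national level, assuming it decreases
    in proportion to the square root of the production increase.
    production_share = regional production / national production."""
    return cv_pct * math.sqrt(production_share)

# Standard deviation and LACIE estimate quoted for the Great Plains (bushels).
cv_great_plains = coefficient_of_variation_pct(66_500, 1_253_300)

# ASSUMPTION: Great Plains share of national production, inferred, not reported.
ASSUMED_GREAT_PLAINS_SHARE = 0.638
cv_national = project_cv_pct(cv_great_plains, ASSUMED_GREAT_PLAINS_SHARE)
print(round(cv_great_plains, 2), round(cv_national, 2))
```

The projection treats the square-root rule as an equality, whereas the text presents it as the least decrease to be expected, so the national figure computed this way is an upper estimate of the c.v.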

Comparing these quantities to those required to meet the 90/90
criterion, it is noted that the c.v. of 4.24 percent projected
to the national level is well within the 6 percent required for
the 90/90 estimates. In addition, the relative difference between
the LACIE and the SRS of -8.79 percent is not sufficiently large
to indicate a statistically significant underestimate. Thus, based
on this feasibility test, there is a reasonable expectation that
the LACIE approach will satisfy the 90/90 criterion.


Turning to Table A-I, it can be seen that the originally proposed
LACIE area estimator, when combined with any of the alternate
yield approaches, should also satisfy the 90/90 criterion. Note
in addition that the use of the LACIE area estimates in the
Group II segments (see Section A2) provides improved area and
production estimates in all cases when compared to the alternate
area estimation approach in which Group II segment estimates
were not used.

Table A-II² contains a more detailed comparison at levels below
the Great Plains for the area estimator utilizing Group II
segments and the original yield model (first column of Table A-I).

As noted in Section A2 these performance numbers are computed 
to detect conditions which might degrade the LACIE estimator 
performance. In this table, it can be seen that although a 
considerable fraction of the segments was lost to cloud cover, 
the area estimates at the Great Plains level did not apparently 
suffer to an intolerable degree since they were acceptable for 
making production estimates which met the 90/90 criterion. 

Results for other estimators are shown in Tables A-III and A-IV.


²Since this Evaluation Report was compiled, refinements have been
made using the Landsat mosaics to improve the estimate of the
agricultural area per stratum. These refinements improved somewhat the
area (and hence production) estimates reported herein, but do not change
the basic conclusion of the evaluation. At the Great Plains level, the
relative difference changes by less than 1/10 of 1 percent. For one
state (Montana) the difference is about 1 percent; for other states it
is negligible. The c.v. is in most cases reduced (i.e., less variance
in the estimate).
TABLE A-I.- PRODUCTION FEASIBILITY TEST RESULTS (U.S. GREAT PLAINS)

                                                      YIELD ESTIMATORS

                                      ORIGINAL REGRESSION MODEL      FLAGGING         FLAG + TREND
                                                                     AT STATE         ADJUST AT
                                      CRD             STATE          LEVEL            STATE LEVEL

  YIELD R.D.                          2.73%           0.7%           4.25%            2.05%
  YIELD C.V.                          1.66%           3.29%          2.29%            1.90%

  AREA ESTIMATORS    AREA             PRODUCTION      PRODUCTION     PRODUCTION       PRODUCTION
                     R.D. ± C.V.      R.D. ± C.V.     R.D. ± C.V.    R.D. ± C.V.      R.D. ± C.V.

  UTILIZING          -10.71 ± 5.66    -8.79 ± 5.31    -8.75 ± 6.03   -5.62 ± 5.87     -8.52 ± 5.75
  GROUP II
  SEGMENTS

  GROUP II           -10.44 ± 8.84    -10.44 ± 8.94   -12.73 ± 9.13  -9.40 ± 8.91     -12.69 ± 8.79
  TREATED AS
  GROUP III

  R.D. = RELATIVE DIFFERENCE = (LACIE - SRS) ÷ LACIE
  C.V. = COEFFICIENT OF VARIATION = √VAR(LACIE) ÷ LACIE












TABLE A-II.- RELATIVE DIFFERENCE AND COEFFICIENT OF VARIATION OF LACIE ESTIMATES
(YIELD-ORIGINAL CCEA MODELS OPERATED AT CRD LEVEL)

                        NUMBER OF   RELATIVE DIFFERENCE (%) ± COEFFICIENT OF VARIATION (%)
                        SEGMENTS
                        UTILIZED/
REGION                  ALLOCATED   PRODUCTION         AREA               YIELD

WINTER WHEAT
  COLORADO              24/32        32.93 ± 20.71      26.10 ± 20.80       9.49 ±  5.79
  KANSAS                55/84        20.66 ±  8.06       6.50 ±  7.07       9.69 ±  3.30
  NEBRASKA              23/35       -20.69 ± 28.12     -15.54 ± 28.00       4.95 ±  3.36
  OKLAHOMA              29/40       -12.55 ± 12.40       2.98 ± 11.19     -13.10 ±  4.50
  TEXAS                 28/49       -30.97 ± 28.71     -35.14 ± 32.62       0.73 ±  4.27
  TOTAL WINTER WHEAT    159/240       6.01 ±  6.69      -0.17 ±  6.95       5.11 ±  1.92

SPRING WHEAT AND MIXED
WINTER AND SPRING WHEAT
  MINNESOTA             9/13        -33.23 ± 12.72     -32.28 ± 15.67       0.10 ±  4.42
  NORTH DAKOTA          42/65       -88.75 ± 12.91     -74.49 ± 14.81     -26.22 ±  7.24
  MONTANA               39/60       -33.21 ± 22.65     -24.19 ± 25.94       0.13 ±  3.71
  SOUTH DAKOTA          23/33       -27.79 ± 13.79      27.71 ± 17.65      44.44 ±  3.10
  TOTAL SPRING WHEAT    113/171     -39.14 ±  8.59     -30.14 ±  9.75       0.74 ±  3.01
  AND MIXED WINTER
  AND SPRING WHEAT

GREAT PLAINS            272/411      -8.79 ±  5.31     -10.71 ±  5.66       2.73 ±  1.66

NATIONAL PROJECTION     272/637     COEFFICIENT OF VARIATION FOR PRODUCTION = 4.24









TABLE A-III.- RELATIVE DIFFERENCE AND COEFFICIENT OF VARIATION OF LACIE ESTIMATES
(YIELD-ORIGINAL CCEA MODELS OPERATED AT STATE LEVEL)

                        NUMBER OF   RELATIVE DIFFERENCE (%) ± COEFFICIENT OF VARIATION (%)
                        SEGMENTS
                        UTILIZED/
REGION                  ALLOCATED   PRODUCTION         AREA               YIELD

WINTER WHEAT
  COLORADO              24/32        33.09 ± 21.84      26.10 ± 20.80       9.64 ±  6.82
  KANSAS                55/84        20.48 ±  8.11       6.50 ±  7.07      14.96 ±  3.98
  NEBRASKA              23/35        -8.11 ± 28.40     -15.54 ± 28.00       6.43 ±  4.94
  OKLAHOMA              29/40       -12.48 ± 14.70       2.98 ± 11.19     -15.94 ±  9.59
  TEXAS                 28/49       -60.21 ± 33.37     -35.14 ± 32.62     -18.56 ±  7.41
  TOTAL WINTER WHEAT    159/240       4.93 ±  7.01      -0.13 ±  6.95       3.80 ±  2.86

SPRING WHEAT AND MIXED
WINTER AND SPRING WHEAT
  MINNESOTA             9/13        -35.28 ± 17.21     -32.28 ± 15.67      -2.32 ±  7.20
  NORTH DAKOTA          42/65       -85.13 ± 20.85     -74.49 ± 14.81      -6.15 ± 14.83
  MONTANA               39/60       -32.39 ± 26.42     -24.19 ± 25.94      -6.46 ±  5.21
  SOUTH DAKOTA          23/33        33.46 ± 18.88      27.71 ± 17.65       7.86 ±  6.83
  TOTAL SPRING WHEAT    113/171     -35.85 ± 11.41     -30.14 ±  9.75      -3.70 ±  6.99
  AND MIXED WINTER
  AND SPRING WHEAT

GREAT PLAINS            272/411      -8.75 ±  6.03     -10.71 ±  5.66       0.70 ±  3.29

NATIONAL PROJECTION     272/637     COEFFICIENT OF VARIATION FOR PRODUCTION = 4.82

TABLE A-IV.- RELATIVE DIFFERENCE AND COEFFICIENT OF VARIATION OF LACIE ESTIMATES
(YIELD-FLAGGED CCEA MODELS OPERATED AT STATE LEVEL)

                        NUMBER OF   RELATIVE DIFFERENCE (%) ± COEFFICIENT OF VARIATION (%)
                        SEGMENTS
                        UTILIZED/
REGION                  ALLOCATED   PRODUCTION         AREA               YIELD

WINTER WHEAT
  COLORADO              24/32        33.09 ± 21.84      26.10 ± 20.80       9.64 ±  6.33
  KANSAS                55/84        20.48 ±  8.11       6.50 ±  7.07      14.96 ±  3.98
  NEBRASKA              23/35        -8.43 ± 28.41     -15.54 ± 28.00       6.16 ±  4.99
  OKLAHOMA              29/40       -18.80 ± 13.72       2.98 ± 11.19     -22.45 ±  7.99
  TEXAS                 28/49       -45.92 ± 32.90     -35.14 ± 32.62      -7.98 ±  4.53
  TOTAL WINTER WHEAT    159/240       4.95 ±  7.04      -0.13 ±  6.95       4.15 ±  2.58

SPRING WHEAT AND MIXED
WINTER AND SPRING WHEAT
  MINNESOTA             9/13        -35.28 ± 17.21     -32.28 ± 15.67      -2.32 ±  7.20
  NORTH DAKOTA          42/65       -63.08 ± 16.90     -74.49 ± 14.81       6.50 ±  8.23
  MONTANA               39/60       -26.37 ± 26.09     -24.19 ± 25.94      -1.62 ±  2.98
  SOUTH DAKOTA          23/33        40.94 ± 18.55      27.71 ± 17.65      18.22 ±  5.80
  TOTAL SPRING WHEAT    113/171     -24.88 ± 10.50     -30.14 ±  9.75       4.67 ±  4.15
  AND MIXED WINTER
  AND SPRING WHEAT

GREAT PLAINS            272/411      -5.62 ±  5.87     -10.71 ±  5.66       4.25 ±  2.29

NATIONAL PROJECTION     272/637     COEFFICIENT OF VARIATION FOR PRODUCTION = 4.69









It can be generally stated that the relative differences between
the SRS and LACIE state level estimates fluctuate evenly on both
sides of zero, indicating that the LACIE estimators are not
significantly biased. At the Great Plains level, statistical
tests indicate no significant bias in the yield or production
estimates. However, the LACIE area estimator is significantly
underestimating at this level. A check at the subregion level
indicates the source of the problem to be the northern Great
Plains. The winter wheat area in the southern Great Plains has been
estimated quite closely. Examining each of the northern Plains
states, the source of error appears to be located in North Dakota,
where the area difference between LACIE and SRS is significant.
Further examination of this problem, undertaken to determine
whether the area estimation problem is sampling error or
classification error, indicated (see Appendix C, Section C4, and
Table C-IV) that the major source appears to be sampling error.
Efforts are underway to correct this for Phases II and III.

From the subregional yield performances it can be seen that the 
yield model is relatively less accurate in North Dakota than 
elsewhere and also seems to perform better on winter wheat 
although the model is significantly overestimating yield in 
Kansas. These errors are discussed in Appendix B. 

In summary, the production feasibility tests are quite encouraging
in that they indicate the 90/90 criterion can be met. Generally
it would appear that the estimation accuracies are better for
winter wheat than for spring wheat. There is some concern over
the performance in North Dakota, and this problem is being
investigated for solutions in Phases II and III.


ACCURACY ASSESSMENT METHODOLOGY 


As discussed in Appendices A and C, error budgets have been
developed for accuracy assessment which permit an evaluation of
the utility of a yield estimator as a component in a 90/90
production estimator. This analysis requires that the area
estimator be unbiased, that its errors be uncorrelated with the
errors in yield estimation, and that it have a coefficient of
variation (c.v.) of 4.25 percent or less. If the yield estimator
can be shown to satisfy the same criterion as the area criterion,
then it is judged a suitable estimator.
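The 4.25 percent limit can be related to the 6 percent production limit through the Appendix A formula for the c.v. of a product of uncorrelated area and yield estimators. The sketch below is illustrative; the function name is hypothetical, and the c.v. values are passed as fractions rather than percentages.

```python
import math

def production_cv(cv_area, cv_yield):
    """c.v. of production for uncorrelated area and yield errors:
    (c.v.p)^2 = (c.v.a)^2 + (c.v.y)^2 + (c.v.a * c.v.y)^2.
    All values are fractions (0.0425 means 4.25 percent)."""
    return math.sqrt(cv_area ** 2 + cv_yield ** 2 + (cv_area * cv_yield) ** 2)

# Area and yield estimators each at the 4.25 percent limit.
cv_p = production_cv(0.0425, 0.0425)
print(round(100.0 * cv_p, 2))
```

With both components at 4.25 percent, the production c.v. comes out at about 6 percent, which appears consistent with the error budget splitting the allowable random error equally between area and yield.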

Because the yield estimator is a regression-type estimator,
developed from an existing historic data base of reported
yields and recorded weather, it was possible to conduct some
evaluations using this historic base. These were in addition
to tests described in Appendix A in which the Phase I yield
estimates were combined directly with Phase I area results and
evaluated.

Based on the historic yield and meteorological data for the 11
years from 1965 to 1975 for the U.S. Great Plains, eleven
separate trials were run in which the regression models were
developed on years of record prior to each of these years and
then exercised on the test year. For each of the 11 test years,
performance was evaluated in two different ways.
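The trial design described above, in which each model is developed only on years before its test year, can be sketched as a generator of training and test splits. This is an illustration only: the function name is hypothetical, and the first year of record (1955 here) is an assumption, since the start of the historic data base is not given in this section.

```python
def prior_year_splits(first_record_year, test_years):
    """Yield (training_years, test_year) pairs in which the training
    years are all years of record strictly before the test year."""
    for test_year in test_years:
        training_years = list(range(first_record_year, test_year))
        yield training_years, test_year

# ASSUMPTION: the record is taken to start in 1955 for illustration.
splits = list(prior_year_splits(1955, range(1965, 1976)))
for training_years, test_year in splits:
    print(test_year, len(training_years))
```

Each trial therefore sees one more year of record than the one before it, and no trial uses data from its own test year or later.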

In the first approach, the coefficient of variation of the yield
estimate was computed as the standard error of the regression
estimate divided by the LACIE value for the yield. In addition,
the observed relative difference between the SRS reported
value and the LACIE estimated value was also computed. These
performance data are computed for estimates at both the state
and CRD levels.

The coefficients of variation computed for the Great Plains
are then projected¹ to the U.S. national level and compared
to the 4.25 percent criterion. If this criterion is satisfied
and the bias test does not detect a bias, then the model is
judged satisfactory.

An alternate method using the historic data base for evaluating
the ability of the LACIE yield estimator to satisfy the 90/90
criterion has been developed based on comparisons of the
products of the LACIE yield estimates and the SRS area estimates to
the SRS production estimates for each of the test years. Since
for a given year the products of SRS reported area and the
reported yields equal the SRS reported production, the differences
between the SRS production figures and the test production
estimates so obtained can be attributed solely to differences
in the SRS and LACIE yields. These differences will, of course,
be weighted by the reported area in the various strata.
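The alternate method can be illustrated with a small computation in which the LACIE yields are multiplied by the SRS areas, stratum by stratum. The sketch below uses made-up strata and values that are purely hypothetical; the function names are also hypothetical. The relative difference follows the convention of Table A-I, (LACIE - SRS) ÷ LACIE.

```python
def trial_production(srs_area, lacie_yield):
    """Sum over strata of SRS reported area times LACIE estimated yield."""
    return sum(srs_area[stratum] * lacie_yield[stratum] for stratum in srs_area)

def relative_difference_pct(lacie_value, srs_value):
    """Relative difference in percent, (LACIE - SRS) / LACIE."""
    return 100.0 * (lacie_value - srs_value) / lacie_value

# HYPOTHETICAL strata: areas in acres, yields in bushels per acre.
srs_area = {"stratum_1": 1000.0, "stratum_2": 2000.0}
srs_yield = {"stratum_1": 30.0, "stratum_2": 25.0}
lacie_yield = {"stratum_1": 33.0, "stratum_2": 24.0}

srs_production = trial_production(srs_area, srs_yield)
test_production_estimate = trial_production(srs_area, lacie_yield)
rd = relative_difference_pct(test_production_estimate, srs_production)
print(srs_production, test_production_estimate, round(rd, 2))
```

Because the same SRS areas enter both sums, the difference between the two totals reflects only the yield differences, weighted by stratum area, as the text states.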

A criterion has also been developed to ascertain the statistical
properties which these test production estimates must have
in relation to the 90/90 criterion. To develop the test
estimate criterion, equal amounts of random error are attributed
to the yield and area estimators. Under this assumption
it can be shown that if, in eight out of ten test cases,
production estimates at the Great Plains level are within a
tolerance bound of ±9.5 percent of the SRS production estimates,
the LACIE yield estimates can be reasonably expected to satisfy
the 90/90 criterion at the national level. In addition,
tolerance bounds at the state levels can also be computed by
assuming an increase in the Great Plains tolerance bounds
proportional to the square root of the decrease in total
production to the state levels. Thus, if eight of ten of the state
test production estimates fall within these state level
tolerance bounds, the state estimator is judged adequate. In such
a case, the yield estimator could be expected to produce 90/90
estimates for a region producing about the same amount of
wheat as the U.S. and in which agricultural and climatic
conditions were similar to those of the particular test state.

¹This projection assumes c.v.y to decrease in proportion to the
square root of the increase in production from the Great Plains to the
national level.
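The tolerance-bound test described above can be sketched as follows. The function names, the production figures, and the list of relative differences are all hypothetical values chosen for illustration; only the ±9.5 percent Great Plains bound, the square-root scaling, and the eight-of-ten rule come from the text.

```python
import math

def state_tolerance_pct(great_plains_tol_pct, great_plains_production, state_production):
    """Widen the Great Plains tolerance in proportion to the square root
    of the decrease in production from the Great Plains to the state."""
    return great_plains_tol_pct * math.sqrt(great_plains_production / state_production)

def passes_eight_of_ten(rel_diffs_pct, tol_pct, required=8):
    """True when at least `required` relative differences lie within +/- tol_pct."""
    hits = sum(1 for diff in rel_diffs_pct if abs(diff) <= tol_pct)
    return hits >= required

# HYPOTHETICAL production totals in arbitrary but consistent units.
state_tol = state_tolerance_pct(9.5, 1200.0, 300.0)

# HYPOTHETICAL relative differences (percent) for ten Great Plains test years.
great_plains_diffs = [1.0, -3.0, 5.0, -8.0, 9.4, 12.0, -2.0, 4.0, -6.0, 7.0]
print(state_tol, passes_eight_of_ten(great_plains_diffs, 9.5))
```

A state producing one quarter of the Great Plains total would, under this scaling, be allowed a tolerance twice as wide as the Great Plains bound.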

YIELD ESTimTION FEASIBILCT^ TESTS 

One basic yield estimation approach was tested in Phase I over 
the Great Plains, with two variants of this basic approach also
evaluated for assessment of potential improvements. The basic
regression approach utilized monthly average temperature and 
precipitation as the weather variables with a trend term to 
account for other effects. 

One alternative, referred to herein as the "flagged" model,
utilized the basic regression model with quantitative upper and
lower bounds which the values of the input variables were not
allowed to exceed. If the monthly average precipitation
exceeded the 90th percentile value, or if the monthly average




temperature fell below the 5th or exceeded the 95th percentile value, as
determined from historic data, the value of the input variable
for the model was set to that particular percentile value.
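
The flagging rule just described amounts to clipping each input at a historic percentile. A minimal sketch follows, assuming a simple linear-interpolation percentile; the helper names and the historic series are hypothetical, and the report does not state how the percentiles were actually computed.

```python
def percentile(values, pct):
    """Linear-interpolation percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (pct / 100.0) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)

def flag_precipitation(value, history):
    """Cap monthly precipitation at the historic 90th percentile."""
    return min(value, percentile(history, 90))

def flag_temperature(value, history):
    """Confine monthly temperature to the historic 5th-95th percentile range."""
    low, high = percentile(history, 5), percentile(history, 95)
    return max(low, min(value, high))

# Hypothetical historic monthly averages (units are arbitrary here).
precip_history = [float(v) for v in range(0, 101, 10)]  # 0, 10, ..., 100
temp_history = [float(v) for v in range(0, 101, 5)]     # 0, 5, ..., 100

print(flag_precipitation(120.0, precip_history))  # capped near 90
print(flag_temperature(-3.0, temp_history))       # raised to about 5
print(flag_temperature(50.0, temp_history))       # unchanged
```

Values inside the allowed range pass through untouched, so the regression only sees a different input in the anomalous months.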

This approach, purely heuristic, was taken to eliminate unreal- 
istically high or low values of yield estimates obtained in 
certain instances in the evaluation of the original model. Of 
course, flagging daily, instead of monthly, values of these 
parameters should be more effective in eliminating effects due 
to anomalous meteorological phenomena, but for Phase I such 
data was not used in the LACIE models. 

A final variation utilized the "flagged" model and an "improved 
fit" for the trend term. This fit was chosen using the yield 
data prior to the 1975 crop year.

The original regression model was also exercised at both the 
crop reporting district and at the state level, the alternates 
at only the state level. In the U.S. Great Plains there are 
models for each of twelve regions, each model developed by 
conducting a regression of historic yield values for the 
region against the historic meteorological data for the region. 
Once the coefficients have been determined, the weather at any 
level can be input to the model to obtain a yield estimate. 
Thus, in anticipation that a combination of the LACIE yield
and area estimates at a geographic level below the state might
be better suited to production estimation, a test was conducted
using the crop reporting district yield estimates obtained by
exercising the models with weather for the crop reporting
district. It should be noted that this is at best an approximation
to the performance obtainable by developing regression models for
each crop reporting district, an approach anticipated to be
more accurate.
RESULTS


In summary, the variety of tests on the initial yield models
indicated they are marginally suitable as LACIE estimators. In
reviewing the results in more detail, it was discovered that
one prime contributor to the errors was the mathematical form
of the regression models, which created unrealistically high or
low yields when the monthly averages of the input meteorological
variables tended toward extremely high or low values.

A modest change to these models was attempted by "flagging" the 
values, i.e., defining ranges which the input values are not 
allowed to exceed as described in Section B2. This change pro- 
vided enough improvement so that the performance of these mod- 
els is now judged satisfactory for a 90/90 production estimation 
as opposed to marginal as originally implemented. 

The detailed results obtained by comparing the original model
against the 4.25 percent criterion at the national level, as
discussed in B1, are shown in Table B-I. Here we note that the
CRD model has, on the average, a smaller c.v. than does the
state model. The c.v. of the CRD model satisfies the 4.25
percent criterion in all years whereas the state model fails in
2 of the 11 years. However, the large relative differences
(over 10 percent) observed in 3 of the 11 years with these models
are of concern. Based on this data, the CRD model was judged
marginally suitable. Table B-II shows the same results when the
"flagged" model was exercised. The results of the analysis of
LACIE yield estimates for the 11-year period using the test
method discussed in B1 are summarized by the graphs in figures
B-1 and B-2 for the original and flagged models, respectively.
TABLE B-I.- ESTIMATED COEFFICIENT OF VARIATION AND RELATIVE DIFFERENCE
OF YIELD ESTIMATES AT THE NATIONAL LEVEL (ORIGINAL YIELD MODEL)

         Yield estimated at the          Yield estimated at the
         Crop Reporting District level   State level
Year     Relative        c.v.,           Relative        c.v.,
         difference,     percent         difference,     percent
         percent (a)     (b)             percent (a)     (b)

1975       0.9           0.4               6.0           1.6
1974      18.5+          1.0              14.7+          2.3
1973      18.6+          3.0              13.9+          5.5
1972      -1.7+          0.8              -2.1           2.0
1971      -9.4+          0.8              -9.5+          2.4
1970      -6.8+          1.0              -7.7+          2.7
1969       4.9+          1.0               5.4+          2.0
1968       1.5           1.0              -5.5+          2.4
1967      -7.7+          1.6              -7.8           4.5
1966      11.9+          1.4              10.3+          2.8
1965      -0.8           1.3              -1.9           3.0

(a) Actual, calculated at Great Plains level. Significance test utilized
    computed c.v. for Great Plains.

(b) Projected from Great Plains to national level under the assumption
    that c.v. will decrease in proportion to increase in production.

NOTE: The relative difference is normalized with respect to
      SRS production estimates because these were readily
      available for the 10-year retrospective test period.

















Figure B-1.- Eleven-year (1965-1975) yield model evaluation (original models).
Legend: bars are tolerance bounds projected to national level (using 1974
SRS production proportions); tolerance bounds assume the area coefficient
of variation to be equal to the yield coefficient of variation;
P-hat = test production based on LACIE yield estimates at "state" level;
P = SRS production.
Figure B-2.- Eleven-year (1965-1975) upgraded yield model evaluation.
Legend: bars are tolerance bounds projected to national level (using 1974
SRS production estimates); tolerance bounds assume the acreage coefficient
of variation to be equal to the yield coefficient of variation;
P-hat = projected production based on LACIE yield estimates;
P = SRS production.




Recall that in this method the analysis of yield estimates is
in terms of the capability to contribute to acceptable production
estimates, given accurate area estimates. Production in
this case is computed by multiplying SRS state area estimates
by LACIE state yield estimates². The relative differences of
the resulting production estimates are indicated by the dots on
the graph, and the numbers next to those dots refer to the
calendar year for which each estimate was made. The bars on the
graph are tolerance bounds on these relative differences
projected to national level. Eight out of ten of the relative
differences falling within the tolerance bounds indicates the
acceptance of the hypothesis that the 90/90 production
criterion at the national level is met. A similar hypothesis
test is done for the individual states³ to determine which types
of geographic areas may represent problem areas.
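
The acceptance rule can be sketched in a few lines. The sketch below is illustrative only: it applies the single ±9.5 percent Great Plains bound quoted earlier to the state-model relative differences listed in Table B-I, whereas the bars in the figures are projected to national level, and it reads "eight out of ten" as a required fraction of 0.8 for the eleven test years. Both choices are assumptions made for this example.

```python
def relative_difference(test_production, reference_production):
    """Relative difference in percent, normalized by the reference (SRS) value."""
    return 100.0 * (test_production - reference_production) / reference_production

def passes_eight_of_ten(relative_differences, tolerance):
    """True when at least 80 percent of the cases lie within +/- tolerance."""
    within = sum(1 for d in relative_differences if abs(d) <= tolerance)
    return within / len(relative_differences) >= 0.8

# State-model relative differences for 1975 back to 1965, as read from Table B-I.
state_level = [6.0, 14.7, 13.9, -2.1, -9.5, -7.7, 5.4, -5.5, -7.8, 10.3, -1.9]

# Only 8 of the 11 years fall inside the bound, so the hypothesis is not accepted.
print(passes_eight_of_ten(state_level, tolerance=9.5))  # False
```

Under these assumptions the outcome agrees with the rejection reported for the original models at the Great Plains level.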

It is seen in figure B-1 that the hypothesis would be rejected
for the Great Plains; in other words, the yield estimates
considered collectively over the nine Great Plains states do not
support the 90/90 criterion. The figure indicates that a
partial explanation for this conclusion can be traced to the
generally poor estimates in 1973 and 1974 and to the poor
performance of the North Dakota and Kansas state yield models.

These also are the only two geographic "areas" for which the
hypothesis would be rejected at a national level if the entire
country behaved like either of these areas. In the case of

²Relative difference (percent) =
   [(LACIE production - SRS production) / SRS production] x 100

³Note: The 90/90 criterion is applicable only to a country level.


North Dakota and Kansas, the results shown in figure B-1 tend
to support this conclusion. In that figure, a bias is indicated
in the Kansas yield estimate and a large variance is shown
to be associated with the North Dakota yield estimate.

PROJECTION TO FOREIGN AREAS 

It should be kept in mind that these accuracy figures apply to 
the U.S. Great Plains yield models. Accuracies may degrade to 
some extent in those foreign areas where historical yield and 
weather data bases are less adequate for modeling and real-time
weather inputs to models rely on very sparse reporting networks. 


APPENDIX C 

SUMMARY OF AREA ACCURACY ASSESSMENT 


C1 ACCURACY ASSESSMENT METHODOLOGY

In Appendix A, the methodology for the assessment of the LACIE
wheat production estimator accuracy was described. Given certain
assumptions regarding the manner in which the production
estimates would distribute about the reference production
estimate (the assumption of normality was invoked), methods were
outlined for relating the variance and bias of the production
estimator to the 90/90 criterion. It was concluded that, if
statistical tests do not detect bias in the estimator
and the computed coefficient of variation is 6 percent or less,
there is a reasonable expectation that the production
estimator satisfies the 90/90 criterion. The term "reasonable
expectation" is expounded on in some detail in that appendix.

Since the LACIE production estimator is the sum of products of 
the area and yield estimates obtained for the coincident yield 
and area strata (e.g. , U.S. crop reporting districts) covering 
the survey region, its statistical properties can be derived 
from a knowledge of the statistical properties of the yield 
and area estimators.

An approximate relation has been derived which expresses the
c.v. of the production estimate (c.v._P) in terms of the c.v.
of the area estimate (c.v._A) and the c.v. of the yield
estimate (c.v._Y). (A more exact expression involves sums of
coefficients of variation obtained at the stratum level.)
This expression is

$$(\mathrm{c.v.}_P)^2 = (\mathrm{c.v.}_A)^2 + (\mathrm{c.v.}_Y)^2 + (\mathrm{c.v.}_A \times \mathrm{c.v.}_Y)^2 \tag{C-1}$$

In LACIE, this relationship permits the development of an error
budget in which separate criteria can be established for
the area and yield estimators. In Phase I it was assumed that
the yield estimator would be as accurate as the area estimator
and vice versa. Thus, c.v._A was assumed equal to c.v._Y. Under
this hypothesis, equation C-1 can be solved for c.v._A = c.v._Y
to ascertain what value of these parameters would be
required to obtain the c.v._P of 6 percent needed for 90/90
estimates. The values so obtained are c.v._A = c.v._Y ≤ 4.25
percent.
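
As a check on the arithmetic, the error-budget relation can be solved numerically for equal area and yield components. The sketch below assumes the product-of-independent-estimators form of equation C-1 with all c.v. values expressed as fractions; the function names are illustrative.

```python
import math

def equal_component_cv(target_cv_percent):
    """Solve (C-1) for c.v._A = c.v._Y given the production c.v. (percent)."""
    t = target_cv_percent / 100.0
    # With x = c.v._A = c.v._Y, (C-1) becomes x**4 + 2*x**2 - t**2 = 0,
    # a quadratic in x**2 whose positive root is sqrt(1 + t**2) - 1.
    x_squared = math.sqrt(1.0 + t * t) - 1.0
    return 100.0 * math.sqrt(x_squared)

def production_cv(area_cv_percent, yield_cv_percent):
    """Evaluate (C-1): production c.v. from area and yield c.v. (percent)."""
    a = area_cv_percent / 100.0
    y = yield_cv_percent / 100.0
    return 100.0 * math.sqrt(a * a + y * y + (a * y) ** 2)

component = equal_component_cv(6.0)
print(round(component, 2))                            # about 4.24
print(round(production_cv(component, component), 2))  # recovers 6.0
```

The solution is just under the 4.25 percent figure used in the text, and the cross term contributes very little at these magnitudes.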

Thus, if the area estimator is shown to have a c.v. of ≤4.25
percent and is unbiased, it is considered, with reasonable
expectation, to be a satisfactory component in the overall
production estimator. The same holds for yield.

However, it should be remembered that the final test is the
combination of area and yield as was discussed in Appendix A.
The error budget simply provides a method for ascertaining the
general quality of the area and yield estimators independent
of each other in relation to the 90/90 criterion for production.
In fact, if the area estimator has a c.v. of greater than 4.25
percent and the yield estimator less than 4.25 percent, the
production estimator could still be satisfactory.

In addition, the 4.25 percent random error assignment to area
permits a more detailed evaluation of the random components of
the sampling and classification error contributions to the area
estimator. The random component of the sample error is a
measure of the degree to which, in replicated sample draws,
the wheat area contained in the LACIE samples represents the
wheat area contained in the survey region being sampled. The
random component of the classification error is a measure of
the degree of repeatability with which the LACIE Classification
and Mensuration Subsystem (CAMS) could estimate, in replicated
trials, the area contained in one or more LACIE samples.
The total area estimator random error component is, of course,
a measure of the degree of repeatability with which the LACIE
area estimator could be expected, in replicated trials, to
estimate the actual area contained in the survey region.

The assumption has been made that the classification and
sample errors are independent; i.e., the classification
error is not systematically affected by the sample location.
Under these conditions, the coefficient of variation of the
total area estimate can be expressed in terms of the random
components of the classification error c.v._C and sample
error c.v._S as

$$(\mathrm{c.v.}_A)^2 = (\mathrm{c.v.}_C)^2 + (\mathrm{c.v.}_S)^2 \tag{C-2}$$

c.v._S has been estimated in LACIE to be about 2 percent at the
national level. Since c.v._A at this level should be about
4.25 percent, this would, according to equation C-2, permit
a random component of the classification error of 3.74 percent.
What is meant by this latter statement is that, if the LACIE
samples allocated nationally were repeatedly classified in
independent repeated trials, the coefficient of variation of
the set of estimates of the areas contained within the N
sample segments should be 3.74 percent or less if the
classification technology is to be judged suitable as a component
in the overall production estimator.


This latter criterion is a very important one since it
provides a method for assessing the viability of the
classification technology against a quantitative criterion. This
criterion can also be used to determine the allowable magnitude
of the random component of classification error for any
number (n) of segments by the relation

$$(\mathrm{c.v.}_C)_n = \sqrt{\frac{N}{n}}\,(\mathrm{c.v.}_C)_N \tag{C-3}$$

Thus, assuming that 431 of the 637 segments in the U.S. will be
acquired cloud free (a figure based on statistics from Phase I),
the allowable random error for a collection of n such segments
would be

$$(\mathrm{c.v.}_C)_n = \sqrt{\frac{431}{n}} \times 3.74 \tag{C-4}$$

Thus, for a single sample segment the tolerable random error
component is given by n = 1, or approximately 80 percent.
Thus, if the classifier is unbiased and 431 of the 637 sample
segments are acquired suitably for classification, the area
estimate for a 5 x 6 n. mi. segment must be, in a majority
of instances, to within about 80 percent of the true wheat
area contained by the segment.

A similar analysis for sampling error, based on the 2 percent
goal at the national level, indicates that on a per segment
basis the tolerable random component is about 40 percent.

This can be interpreted to mean that the actual wheat
prevalence in the sample segment should be to within about 40
percent of the actual prevalence in the stratum in a majority
of instances.
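
The per-segment figures of roughly 80 and 40 percent can be reproduced by scaling the national-level values by the square root of the number of segments. The sketch below assumes that square-root scaling and the 431 acquired segments mentioned in the text; the function name is illustrative.

```python
import math

def per_n_segment_cv(aggregate_cv, total_segments, n):
    """Allowable random c.v. (percent) for a collection of n segments."""
    return math.sqrt(total_segments / n) * aggregate_cv

ACQUIRED = 431  # segments assumed acquired cloud free, from the text

classification_single = per_n_segment_cv(3.74, ACQUIRED, 1)
sampling_single = per_n_segment_cv(2.0, ACQUIRED, 1)
print(round(classification_single, 1))  # 77.6, i.e. roughly 80 percent
print(round(sampling_single, 1))        # 41.5, i.e. roughly 40 percent
```

Large single-segment tolerances are acceptable because random errors average out over several hundred segments, provided the errors are unbiased.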

Tests have shown these random error magnitudes are obtainable 
given the currently implemented LACIE technology. Thus, seg- 
ments must be allocated and analyzed in a manner which mini- 
mizes bias. Bias in classification results from mistaken 
identification of wheat as nonwheat and vice versa. If on 
the average these mistakes tend to cancel, the segment area 
estimator will be unbiased. Thus, the aim of classification 
technology is to produce the smallest possible error rate in 
a manner for which classification of wheat as nonwheat tends, 
on the average, to cancel the mistaken identifications of non- 
wheat as wheat. 

Sample error in the form of bias can also creep into the
design, even though the sample selection is random. Such bias
can result purely from a "luck of the draw" phenomenon; that
is, any particular configuration obtained in a sample draw
has a probability to contain either more or less wheat than
is in the sampled region. Since the LACIE sample remains
fixed from year to year, a particular sample configuration
will contain a fixed bias.

C2 AREA ESTIMATION QUASI-OPERATIONAL TESTS 

In Phase I, three sets of area estimates were produced for
the U.S. Great Plains. The initial quasi-operational system
produced area estimates in real time. This operation was
primarily concerned with "debugging" the system. Several serious
implementation problems were uncovered in this real-time
operation. In lieu of a real-time cropping calendar, the
Landsat data was acquired at dates determined from historic
calendars. Using this approach, most of the Landsat data
acquired early in the growing season in Phase I was acquired
before the wheat had emerged and become visible on the Landsat
imagery. Because of the importance of early estimates,
area estimates were attempted using this data by declaring
areas of seed bed preparation as "potential wheat." Since
seed bed preparations are made for other crops, the LACIE
estimates were considerably larger than the actual wheat area.

These system problems were corrected and the Landsat data 
reanalyzed by the LACIE CAMS. The resulting area estimates
based on this reanalysis are referred to herein as the CAMS 
rework estimates. 


A minority of the sample segments will change from year to year
as a result of variable loss to cloud cover.
Two estimates were made using the CAMS rework data. These
two estimates differ only in regard to the inclusion of
Group II segments. These segments, a minority in the total
segment complement, are those segments within Group II
counties which are so sparsely planted to wheat that one
segment is used to estimate the area within several such
counties. The Group II segments often contain less than 5
percent by area of wheat. Initially CAMS attempted to train
the classifier and classify the segment utilizing the maximum
likelihood classifier. It was found that as a result of
inadequate training data and an abundance of confusion crops in
such segments, this procedure tended to overestimate the
amount of wheat contained. A modified procedure was developed
in CAMS to estimate the wheat area in these segments.
Preliminary indications were that the overestimates in these
segments have been corrected. However, final judgement was
reserved following comparisons of wheat area estimates and
variance estimates obtained by aggregating with and without
these segments.

RESULTS OF THE ASSESSMENT 

After correction of the significant problems in the initial
implementation of the LACIE area estimation technology, the
resulting area estimates satisfied the 90/90 criterion for
production, in terms of the criterion of being an unbiased
estimator with a c.v. of less than 4.25 percent and, in
particular, when combined with the actual LACIE yield estimates
(see Appendix A).


The accuracies obtained using the rework estimates, including
Group II segments, are shown in Table C-I.⁷ Note that the
coefficient of variation for this estimate projected to the
national level is 3.74 percent, somewhat smaller than
the 4.25 percent deemed desirable in the discussion of the
previous section, and thus some bias is tolerable. However,
the relative difference of -10.7 percent at the Great Plains
level is sufficiently large to indicate a bias given a c.v._A
of 5.66 percent at that level. Recall also that when these
area estimates were combined with the yield estimates, the
resulting production estimate could, with a reasonable
expectation, satisfy the 90/90 criterion.

From these results in table C-I, the area of most concern as 
regards problem isolation and correction is North Dakota. 

More detailed ground truth and ancillary error analyses in 
Kansas, North Dakota, Nebraska, and South Dakota permitted 
a more detailed assessment of the sampling and classification 
errors. These analyses, to be discussed in Section C4,
indicated the source of the North Dakota problem to be 
sample error. 


⁷Since this Evaluation Report was compiled, refinements have been
made using the Landsat mosaics to improve the estimate of the
agricultural area per stratum. These refinements improved somewhat the area
(and hence production) estimates reported herein, but do not change the
basic conclusion of the evaluation. At the Great Plains level, the
relative difference changes by less than 1/10 of 1 percent. For one
state (Montana) the difference is about 1 percent; for other states it is
negligible. The c.v. is in most cases reduced (i.e., less variance in
the estimate).


TABLE C-I.- ACCURACY OF AT-HARVEST ESTIMATES OF WHEAT AREA
USING CAMS REWORK DATA (GROUP II SEGMENTS INCLUDED)

Region                  Number segments       Computed relative     Coefficient of
                        utilized/allocated    difference, percent   variation, percent

Winter wheat
  Colorado                  24/32                  26.1                 20.8
  Kansas                    55/84                   6.5                  7.07
  Nebraska                  23/35                 -15.5                 28.0
  Oklahoma                  29/40                   3.0                 11.2
  Texas                     28/49                 -35.1                 32.6
Total winter wheat
states                     159/240                 -0.17                 6.95

Spring/winter
  Minnesota                  9/13                 -32.3                 15.7
  N. Dakota                 42/65                 -74.5 (a)             14.6
  Montana                   39/60                 -24.2                 25.9
  S. Dakota                 23/33                  27.7                 17.7
Total spring/winter
mixed states               113/171                -30.1                  9.75

Great Plains               272/411                -10.7                  5.66

Projected to national      272/637                                       3.74

(a) Significant relative difference indicates potential bias.






Table C-II indicates the results when Group II segments were 
not included in the area estimator, and the associated 
Group II counties were treated as Group III counties. As can 
be seen by comparing Table C-I to Table C-II, the area 
estimates are significantly better when the CAMS area esti- 
mates in Group II segments are used in the aggregation. 

ESTIMATION OF AREA ERROR USING BLIND SITE DATA 

The expression "blind site" is merely a designation applied to 
selected operational segments for which, unknown to the 
analyst, ground truth data was acquired for subsequent
evaluation purposes. The implementation of this approach occurred
late in the growing season of LACIE Phase I. Thus, all of 
the selected sites fell in the northern spring wheat regions. 

High resolution color infrared aerial photography over
twenty-nine LACIE segments in North Dakota and Montana (the results
from only 16 of these segments in North Dakota are relevant
to the basic discussion which follows) was acquired in
mid-August 1975. Simultaneously, field teams were collecting
ground information for a substantial portion of these segments.

These data were combined to obtain both field and total
segment ground truth data. The small grain proportion
estimates were statistically compared to the LACIE estimates for
the 16 segments in North Dakota. This resulted in a direct
computation of the classification error, c.v._C, for the state
of North Dakota as shown in Table C-III.
TABLE C-III.- LACIE BLIND SITE DATA
(North Dakota spring small grains)

                     Fraction of area in small grains, percent
County               Ground truth            LACIE                   SRS county
                     (5x6 n. mi. segment)    (5x6 n. mi. segment)    (whole county)

Ward 1                   13.2                    17.1                    33.8
Ward 2                   26.8                     8.2                    33.8
Williams                  3.7                     0                      27.5
McHenry 1                 0                       0                      25.9
McHenry 2                 0.3                     0                      25.9
Rolette                   4.9                    --                      18.8
Ramsey                   38.4                    49.5                    41.5
McKenzie 1                1.3                    --                      10.6
McKenzie 2                1.0                     0.3                    10.6
McLean                   29.3                    28.4                    31.7
Mercer                   16.3                    18.0                    19.9
Oliver                   15.6                    --                      16.2
Kidder                   16.4                    --                      19.4
Sheridan                 12.9                     0                      30.9
Adams                    26.1                    24.4                    22.8
Hettinger                21.7                    24.1                    35.7
Burleigh                 18.2                    12.0                    20.7
Morton                    4.6                     6.7                    15.7
Richland                 31.6                    15.6                    36.2
Sargent                  35.0                    32.3                    34.7

Average (LACIE 16)       17.46                   14.78                   --
Average (all 20)         15.87                   --                      26.00

Correlation high between LACIE and ground truth, r = 0.849.
Variance of LACIE estimates is within allowable range, c.v. = 50 percent.
No apparent bias in LACIE estimate.












This table indicates a relative difference between the
classified wheat proportion and the ground observed proportion of
-15 percent of the ground observed proportion; this is not
indicative of a significant bias in view of the standard error.
However, the difference between the ground truth estimate and
the SRS county figures would explain the underestimate obtained
in North Dakota. Thus, for North Dakota it was concluded that
sampling error was the major source of the observed bias. Other
investigations with full frame imagery confirmed this, in that
agriculture is very heterogeneous in this region and many of
the LACIE segments do not adequately represent the county.
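
The -15 percent figure and the reported correlation can be recomputed from the 16 North Dakota segments in Table C-III that have both a ground truth and a LACIE value. The sketch below uses the values as read from that table; the helper functions are illustrative and use an ordinary Pearson correlation, which is an assumption about how r was obtained.

```python
import math

# (ground truth percent, LACIE percent) for the 16 segments with a LACIE estimate.
SEGMENTS = [
    (13.2, 17.1), (26.8, 8.2), (3.7, 0.0), (0.0, 0.0), (0.3, 0.0), (38.4, 49.5),
    (1.0, 0.3), (29.3, 28.4), (16.3, 18.0), (12.9, 0.0), (26.1, 24.4),
    (21.7, 24.1), (18.2, 12.0), (4.6, 6.7), (31.6, 15.6), (35.0, 32.3),
]

def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def pearson_r(xs, ys):
    """Correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

ground = [g for g, _ in SEGMENTS]
lacie = [l for _, l in SEGMENTS]

# Relative difference of the mean LACIE proportion from the mean ground truth.
relative_diff = 100.0 * (mean(lacie) - mean(ground)) / mean(ground)
correlation = pearson_r(ground, lacie)
print(round(relative_diff))    # -15
print(round(correlation, 2))   # about 0.85
```

Both results agree with the summary notes under the table to within rounding.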

ESTIMATING THE SAMPLING ERROR AT THE SEGMENT LEVEL 

In four states (Kansas, Nebraska, North Dakota, and South
Dakota) the sampling error was estimated for selected counties
(chosen primarily because of sufficient Landsat acquisitions).
These estimates were for small grains. The estimates were made
by a scheme using the full frame Landsat color infrared imagery
in the following manner.

• The Landsat full frame was partitioned into 5 x 6 n. mi.
segments.

• A subsample of these segments was used which was within
county boundaries for selected counties.

• A grid containing 200 points was overlaid on the selected
segments.

• An analyst then determined from imagery at each grid point
whether wheat/small grain or nonwheat/nonsmall grain was
present.


The area proportion for each segment was then computed by
taking the ratio of grid points identified as wheat/small grain
to the total number of grid points. Then, for each county,
the sampling variance (taken to be the estimate of the
within-county variance) and c.v. of the wheat area estimate at the
segment level within that county was computed. These c.v.'s
and the wheat/small grain estimates from each of the four states
were then used to obtain an average segment percent wheat and
c.v.; i.e., an estimate of the segment sampling error, c.v._S.

The results are depicted in Table C-IV.
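
The grid-count procedure can be sketched as follows. The counts are hypothetical, and the use of the sample variance with an n - 1 divisor is an assumption; the report does not say how the within-county variance was computed or how counties were averaged within a state.

```python
import math

GRID_POINTS = 200  # points overlaid on each segment, from the procedure above

def segment_proportions(wheat_point_counts, grid_points=GRID_POINTS):
    """Wheat/small-grain proportion (percent) for each segment in a county."""
    return [100.0 * count / grid_points for count in wheat_point_counts]

def within_county_cv(proportions):
    """Mean proportion and c.v. (both percent) from the within-county variance."""
    n = len(proportions)
    avg = sum(proportions) / n
    variance = sum((p - avg) ** 2 for p in proportions) / (n - 1)
    return avg, 100.0 * math.sqrt(variance) / avg

# Hypothetical counts of grid points labeled wheat/small grain in four segments.
counts = [20, 40, 60, 40]
avg_percent, cv_percent = within_county_cv(segment_proportions(counts))
print(round(avg_percent, 1), round(cv_percent, 1))  # 20.0 40.8
```

A county with little wheat and uneven fields yields a large c.v. even when the absolute spread between segments is modest, since the c.v. is relative to the mean proportion.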

TABLE C-IV.- ESTIMATE OF SAMPLING ERROR c.v._S
AT THE SEGMENT LEVEL

State            Average wheat,     Segment level estimate
                 percent            of c.v., percent

Kansas
North Dakota
Nebraska
South Dakota

The numbers shown in Table C-IV represent the first attempt
within the project to compute sampling error. Some key issues
can be noted. For example, when comparisons between
analyst-derived wheat proportion estimates and SRS county results were
made in Kansas, a consistent underestimate was apparent.

However, since the errors of SRS estimates projected to the
county level are unknown, no conclusions can be drawn immediately
relative to possible bias.



Consideration of the foregoing and observations made of the
magnitudes of the estimates displayed in Table C-IV lead
to the conclusion that the random component of sampling error,
c.v._S, appears to be on the order of the 40 percent figure
permissible for a 2 percent national sample error.

ESTIMATING THE CLASSIFICATION ERROR AT THE SEGMENT LEVEL 

The data obtained (Table C-IV) at the county level were used in
a standard statistical analysis to compute a sampling c.v. at
the state level for each of the four states. In addition, an
estimate of the total c.v. (including the effects of
classification and sampling error) at the state level, c.v._A, was
computed using the LACIE segment estimates and the SRS 1969 census
data at the county level. If it is assumed, as discussed in
A1.0, that $(\mathrm{c.v.}_A)^2 = (\mathrm{c.v.}_C)^2 + (\mathrm{c.v.}_S)^2$, then it follows
immediately that an estimate of c.v._C at the state level can be
obtained.

By considering the number of samples in the state, an estimate
of classification error at the segment level, c.v._C, is obtained
for each state and is depicted in Table C-V.
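
The two steps can be illustrated with the Kansas figures. The sketch assumes that the segment-level value is the state-level value multiplied by the square root of the number of segments utilized, which is an inference from the scaling relation used earlier in this appendix; the count of 55 Kansas segments is taken from Table C-I. Function names are illustrative.

```python
import math

def state_classification_cv(total_cv, sampling_cv):
    """State-level classification c.v. from the sum-of-squares relation."""
    return math.sqrt(total_cv ** 2 - sampling_cv ** 2)

def segment_classification_cv(state_cv, n_segments):
    """Scale a state-level c.v. to a single segment by sqrt(number of segments)."""
    return state_cv * math.sqrt(n_segments)

# Kansas: total c.v. of 10 percent and sampling c.v. of 6 percent (Table C-V);
# 55 segments utilized (Table C-I).
kansas_state = state_classification_cv(10.0, 6.0)
kansas_segment = segment_classification_cv(kansas_state, 55)
print(kansas_state)            # 8.0
print(round(kansas_segment))   # 59
```

These reproduce the Kansas entries of 8 and 59 percent in Table C-V. The other states agree less closely, which suggests the tabulated values were rounded or computed with slightly different inputs.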


TABLE C-V.- ESTIMATE OF CLASSIFICATION ERROR
c.v._C AT THE SEGMENT LEVEL

                        State level                        Segment level
State        Estimated    Estimated    Estimated           Estimated
             c.v._A,      c.v._S,      c.v._C,             c.v._C,
             percent      percent      percent             percent

Kansas          10            6            8                   59
N. Dakota       15           13           10                   65
Nebraska        39           12           37                  161
S. Dakota       20           14           16                   73


Observation of Table C-V indicates the following:

• In all states but Nebraska, the classification error at the
state level is acceptable and is about equal to the sampling
error at the state level, i.e., c.v._C ≈ c.v._S.

• Classification error at the state level in Nebraska, known
to result from confusion crops, indicates a potential problem.

In addition, one can conclude from Tables C-IV and C-V that
in North Dakota the observed relative difference does not
appear to result from the random components of classification
error, c.v._C, and sampling error, c.v._S. Thus, a systematic
problem may exist within the allocation of the LACIE North
Dakota segments.

In Table C-VI are presented the results of an independent
estimate of the classification and sampling error using the
blind site data. The c.v._C is computed from the differences
in the LACIE and ground truth proportion estimates
(Table C-III). The c.v._S is computed from comparisons of the
ground truth and SRS county figures in Table C-III.

It should be noted that the sampling and classification errors
determined by this method for North Dakota compare very
favorably with errors shown in Table C-V, thus establishing some
agreement among the various approximate methods utilized to
compute sample and classification errors.

TABLE C-VI.- BLIND SITE ESTIMATES OF SAMPLING AND
CLASSIFICATION ERROR AT THE STATE LEVEL

State            Estimated c.v._S,     Estimated c.v._C,
                 percent               percent

North Dakota         10                    10


SUMMARY 


It appears that the LACIE area estimates over the Great Plains
can, with a reasonable expectation, be a satisfactory component
of a 90/90 production estimator. The area estimator produced
more accurate area estimates for the total winter wheat region
than for the mixed spring and winter wheat region of the
northern Great Plains. The major problem in the spring/winter
states appears to be North Dakota. Detailed tests indicate
that sample error is the source of the problem. Phase I
comparisons of LACIE estimates with ground truth indicate that
the LACIE classification technology is working acceptably well.
The accuracy does appear to degrade somewhat in regions of
marginal agriculture where there are small fields and abundant
confusion crops. However, it would appear that these regions
tend also to be marginal with respect to wheat production and
thus increased area estimation errors do not greatly influence
the overall production estimation accuracy in the United
States. The loss of segments resulting from cloud cover
appears to be a random phenomenon that introduces no significant
bias into the estimates. This loss does increase the variance
of the estimates.